Abstract

One of the characteristic features of genetic networks is their inherent robustness, that is, their 
ability to retain functionality in spite of the introduction of random errors. In this paper, we seek 
to better understand how robustness is achieved and what functionalities can be maintained robustly. 
Our goal is to formalize some of the language used in biological discussions in a reasonable mathematical 
framework, where questions can be answered in a rigorous fashion. These results provide basic conceptual 
understanding of robust regulatory networks that should be valuable independent of the details of the 
formalism. 

We model the gene regulatory network as a boolean network, a general and well-established model 
introduced by Stuart Kauffman. A boolean network is said to be in a viable configuration if the node 
states of the network at its fixpoint satisfy some pre-specified constraint. We define how mutations affect 
the behavior of the boolean network, and we say a network is robust if most random mutations of the 
model reach a viable configuration. 

We first study the case when the boolean network is specified by a directed acyclic graph. Random 
mutations induce a neighborhood around the configuration that would be reached if there were no 
mutations. We show that for the case of acyclic networks, this neighborhood is a bijective transformation 
of the usual Hamming neighborhood. A robust acyclic network chooses the bijection so that most of 
the neighborhood lies in the space of viable configurations. The greater the degree of the network, the 
more complex the bijection is allowed to be and thus the greater the possibility of robustly satisfying 
constraints. 

Next, we study networks where directed cycles are present. We show that cyclic networks can make 
the volume of the neighborhood smaller, by mapping different errors in the network to the same final 
configuration. Thus, cyclic networks can be dramatically more powerful with respect to robustness than 
acyclic networks. Also, we explicitly describe a large class of constraints for which cyclic networks provide 
robustness. 

Keywords: robustness; regulatory networks; mathematical modelling; boolean networks 



1 Introduction: Understanding Robustness 

One of the hallmark features of life is its diversity. To solve the same basic problems of survival and 
reproduction, nature has devised a scintillating array of solutions, each remarkably different from others in 
many aspects. In the long term, such biological innovation is necessary, because changing environmental
conditions mean that some solutions will become defunct while others will become more advantageous. The
fundamental question that then arises is: how does innovation in biology arise?

To this end, Wagner in [Wag05] defined a biological system to be evolvable "if it can acquire novel 
functions through genetic change, functions that help the organism survive and reproduce." Exactly which
features make a system evolvable remains a mystery. The question of evolvability can
also be phrased in terms of the genotype-phenotype map. The genotype of an organism is the hereditary 
information contained in the genome, while the phenotype is the set of properties actually exhibited by 
the organism and acted upon by natural selection. For evolvability, the map between the genotype and 
phenotype must be such that random mutations to the genotype can possibly "improve" the phenotype 
so that novel functionality is acquired. In nature, this map is not the identity map: there is a complex 
translation process that constrains the phenotypes that can be expressed while also encouraging variability.
One can ask then for the properties of the genotype-phenotype map for evolvable biological systems. 

In this paper, we rigorously study one specific aspect of evolvability: robustness. At a first glance, 
robustness and evolvability seem to be diametrically opposed concepts. Robustness refers to a system's 
ability to retain functionality in the presence of changes, while evolvability refers to a system's ability to 
acquire new functionality. To resolve this dilemma, let us define robustness more precisely, following [Wag05].
Given a specific phenotypic property f, we say a mutation to the genotype is neutral with respect to f if
that mutation does not affect possession of the property f. We say a system is robust with respect to f
if the vast majority of mutations are neutral with respect to f. For instance, the phenotypic property
to be preserved could be the RNA secondary structure, which is a prerequisite for RNA function. Then
genetic change in an RNA molecule that is neutral with respect to RNA secondary structure would preserve 
the RNA's secondary structure but potentially change other aspects. Another example (also discussed in 
[Wag05]) is cryptic variation in developmental genes. These mutations preserve the development of complex 
organs, such as the eye and legs, under usual circumstances but could drastically alter their development in 
alternate environments [RL98]. In this case, the property being preserved is development of these organs in 
a specific environmental and genetic background. 

Now, we can explain why robustness with respect to a given phenotypic property increases evolvability. 
If a system is robust with respect to some primary property f, then, since most mutations are neutral
with respect to f, the system can express many phenotypes satisfying f and thus has a higher chance of
encountering a phenotype that satisfies some other property g. Thus, novel phenotypes can be discovered
without destroying functionalities already achieved. Gould used the term exaptation ([GV82])
for organismal features that become adaptations to new conditions long after they have
arisen to achieve some more basic functionality. In other words, robustness allows a system to accumulate a
reservoir of neutral mutations and thus gives it a greater potential for innovation with respect to new, unexplored
functionalities.

Evidence for robustness with respect to particular phenotypic properties is abundant throughout nature. 
Although we describe robustness as resilience to mutations in the genome, similar notions also exist for 
changes at all levels of organization. Proteins tolerate thousands of amino acid changes, metabolic networks 
continue to function after removal of intermediate steps, gene regulatory networks are unaffected by alteration 
of gene interactions, genetic changes in embryonic development often hardly affect the viability of the adult 
organism, and microbes and higher organisms can tolerate complete elimination of many genes. Organization 
in biological structures is incredibly complex, and it is often a matter of great mystery how such robustness 
can be achieved at all. 

This paper is part of an effort to better understand the robustness property and the environments 
under which robust solutions are possible. Our goal is to formalize some of the language used in biological 
discussions in a reasonable mathematical framework, where questions can be answered in a rigorous fashion. 

To simplify the discussion, let us only look at the case of robustness of the phenotype to mutations
in the genotype (although many of our results can potentially be applied to robustness at other levels of
organization). The genotype takes the form of a regulatory gene network. The expression level of each gene
is functionally related to the expression levels of some other genes. Thus, if the expression level of one gene
is changed, expression levels of other genes are also modified according to the regulatory connections. To
model these networks, we quantize gene expression into two levels, ON and OFF, and describe their interaction
via boolean networks, introduced by Kauffman [Kau69, Kau93, Kau95]. This is a very general and
well-established model. The phenotype expressed by a particular genotype (represented by the regulatory
network) is described by the stable configurations of the network; this correspondence has been verified using
simulation and by evaluating gene expression data [AO03, MAB00, ESPLAB04, HI00, HEBYI05, GdBLC03].

In this paper we take this widely used biological model and try to identify, describe and understand
the conditions under which robust gene networks can work, i.e., which conditions can be robustly satisfied
by regulatory networks. We furthermore want to investigate how such robust solutions can be found,
an everyday task of nature. We believe that answers to these questions are essential for understanding
the biology of evolution, the power and structure of regulatory (gene) networks and their interaction with
mutations.

Organization. We continue in Section 2 by giving formal definitions for our model and stating our main
results, while relating them back to the biological motivation discussed above. In Section 3, we give a more 
detailed discussion of our results along with proofs. Finally, we end with some conclusions and suggestions 
for future work in this area. 

2 The Model: Networks and Mutations 
2.1 Definitions 

In this section we give the formal definition of the model we will investigate; we biologically justify the details 
of our formalism in Section 2.2. 

The phenotype is specified by gene expression levels, modeled as n boolean^1 characters $x_1, \dots, x_n \in \{\pm 1\}$.
The phenotype $(x_1, \dots, x_n)$ is said to be viable exactly when $f(x_1, \dots, x_n) = 1$, where $f : \{\pm 1\}^n \to \{\pm 1\}$ is
the constraint, determined by the environment or genetic background, that is to be satisfied. We call $f$ the objective
function.

Next, we formalize the definition of the genotype by asserting that the genotype encodes a boolean
network with the gene expressions as boolean variables. Such a boolean network $N = (x, u_1, \dots, u_n)$ is
specified by n boolean variables $x = (x_1, \dots, x_n) \in \{\pm 1\}^n$ and n corresponding update functions
$u_1, \dots, u_n : \{\pm 1\}^n \to \{\pm 1\}$, where $u_i$ describes how the variable $x_i$ depends on the other variables.

Every network $N = (x, u_1, \dots, u_n)$ induces a directed graph $G_N = (V_N, E_N)$ on its variables $V_N = \{x_1, \dots, x_n\}$,
where $(x_i, x_j)$ is an edge in $E_N$ iff the variable $x_i$ has an influence on the update function $u_j$ of
node $x_j$, i.e., iff there exist $x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n$ such that
$u_j(x_1, \dots, x_i = -1, \dots, x_n) \neq u_j(x_1, \dots, x_i = 1, \dots, x_n)$. The graph $G_N$ describes how
changes can in principle "spread" through the network $N$.
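
As an illustration, the induced graph can be computed by brute force directly from this definition. The following Python sketch is ours and is only practical for small n; the function name `influence_graph`, the 0-based indexing, and the representation of update functions as Python callables on tuples over {-1, +1} are assumptions made for illustration.

```python
from itertools import product

def influence_graph(updates):
    """Return the edge set {(i, j)}: variable i influences update function j.

    `updates` is a list of n callables, each mapping a tuple in {-1, +1}^n
    to -1 or +1. Indices are 0-based (the text uses 1-based indices).
    """
    n = len(updates)
    edges = set()
    for j, u in enumerate(updates):
        for i in range(n):
            # Search for an assignment to the other n - 1 variables on which
            # switching x_i from -1 to +1 changes the value of u_j.
            for rest in product((-1, 1), repeat=n - 1):
                lo = rest[:i] + (-1,) + rest[i:]
                hi = rest[:i] + (1,) + rest[i:]
                if u(lo) != u(hi):
                    edges.add((i, j))
                    break
    return edges

# Hypothetical 3-node network: x0 is constant, x1 copies x0, x2 = x0 AND x1
# (with 1 = True and -1 = False, AND is the minimum).
example = [lambda x: 1, lambda x: x[0], lambda x: min(x[0], x[1])]
example_edges = influence_graph(example)
```

The search enumerates all $2^{n-1}$ assignments per candidate edge, so it is exponential in n; it is meant to mirror the definition, not to be efficient.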

Next we want to describe how exactly the states of the boolean variables of a network change over time.
For this we define a configuration of a boolean network as an assignment $a \in \{\pm 1\}^n$ to all its variables (i.e.,
$\forall i = 1, \dots, n : x_i = a_i$). Then we define a dynamic system for every boolean network $N = (x, u_1, \dots, u_n)$
together with an initial configuration $a$ by inductively defining a sequence of configurations:

$$x(1) = a$$

$$x(t+1) = (u_1(x(t)), \dots, u_n(x(t))) \quad \forall t \geq 1$$

This dynamic system gives a configuration for every time t. For time $t = 1$ this is the initial
configuration $x(1) = a$, and for later times the configuration $x(t+1)$ is formed by applying the update functions
to the previous configuration $x(t)$. Note that the states of all the nodes are updated synchronously.

^1 A boolean domain has exactly two values with interpretations as True and False. We use these interpretations with the
functions $\lor$ (OR), $\land$ (AND) and $\lnot$ (NOT) interchangeably with the values 1 (True) and -1 (False).

Figure 1: The boolean network on the left reaches the final configuration $(x_1, x_2, x_3, x_4) = (1, -1, 1, 1)$. The
mutated network on the right reaches the final configuration $(x_1, x_2, x_3, x_4) = (1, 1, 1, 1)$.



Since the configuration space is finite and the dynamics of the network are deterministic, the network
will eventually fall into a previously visited configuration, after which the configuration dynamics become
periodic. This cyclic trajectory is called an attractor. If the attractor is just one configuration (a cycle of
length 1), it is called a fixpoint. Clearly, given an initial configuration $a$, if a network does reach a
fixpoint starting from $a$, that fixpoint is unique. Assuming the network $N$ does reach a fixpoint starting from $a$, we
denote it by $\mathrm{Fix}(N, a)$; if it does not, $\mathrm{Fix}(N, a)$ is undefined.
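
The synchronous dynamics and the fixpoint can be sketched in a few lines of Python. This is a minimal illustration under our own conventions, not part of the formal model: the function name `fix` is hypothetical, configurations are tuples over {-1, +1}, and an undefined fixpoint is reported as `None`.

```python
def fix(updates, a):
    """Iterate x(t+1) = (u_1(x(t)), ..., u_n(x(t))) starting from x(1) = a.

    Returns the fixpoint reached from `a` as a tuple, or None if the
    trajectory enters a cycle of length greater than 1 (fixpoint undefined).
    """
    x = tuple(a)
    seen = set()
    while x not in seen:
        seen.add(x)
        nxt = tuple(u(x) for u in updates)  # all nodes update synchronously
        if nxt == x:
            return x
        x = nxt
    return None  # a configuration repeated without ever being a fixpoint

# Hypothetical acyclic network: x0 is constant, x1 copies x0, x2 = x0 AND x1.
acyclic = [lambda x: 1, lambda x: x[0], lambda x: min(x[0], x[1])]
# Hypothetical cyclic network: a single node that negates itself.
oscillator = [lambda x: -x[0]]
```

Because the configuration space has $2^n$ elements, the loop terminates after at most $2^n$ iterations.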

Next, we specify how mutations change the boolean network. As motivated below in Section 2.2, we look
at mutations that modify the update functions without changing the topology $G_N$ of the network. We say
that a network $N'$ is a mutation of a network $N$ if $N = (x, u_1, \dots, u_n)$ and $N' = (x, v_1, \dots, v_n)$ with each
$v_i$ either equal to $u_i$ or $-u_i$. Note that inverting an update function does not change whether one variable
has an influence on another variable, so that always $G_N = G_{N'}$. Figure 1 shows an example of a network
and a mutation of it. Given a mutation parameter $\epsilon \in (0, \frac{1}{2})$, we define an $\epsilon$-mutation of a boolean network
$N = (x, u_1, \dots, u_n)$ to be a random variable denoting a boolean network $N' = (x, v_1, \dots, v_n)$ with the
same variables but changed update functions. Specifically, independently for every $i \in [n]$, the new update
function $v_i$ is equal to the original $u_i$ with probability $1 - \epsilon$ and to its complement $-u_i$ with probability $\epsilon$.
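
A mutation and a random mutation with a given mutation parameter can be sketched as follows. The helper names `mutate` and `sample_eps_mutation` are hypothetical, and update functions are again assumed to be Python callables returning -1 or +1.

```python
import random

def mutate(updates, flips):
    """Return the mutated network: update function i is negated iff flips[i] is True."""
    return [(lambda x, u=u: -u(x)) if flip else u
            for u, flip in zip(updates, flips)]

def sample_eps_mutation(updates, eps, rng=random):
    """Draw a random mutation: each update function is complemented
    independently with probability eps. Returns (mutated network, flip pattern)."""
    flips = [rng.random() < eps for _ in updates]
    return mutate(updates, flips), flips
```

Passing a seeded `random.Random` instance as `rng` makes the sampling reproducible, which is convenient when estimating survival probabilities by simulation.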

Finally, we want to quantify how well a boolean network satisfies a given objective function in the presence
of mutations. Formally, for a mutation parameter $\epsilon \in (0, \frac{1}{2})$ and a survival probability $\delta \in [0, 1]$, we say that
a network $N$ with an initial configuration $a \in \{\pm 1\}^n$ is $(\epsilon, \delta)$-robust with respect to an objective function
$f : \{\pm 1\}^n \to \{\pm 1\}$ iff the probability that the fixpoint $x(\infty) = \mathrm{Fix}(N', a)$ reached by an $\epsilon$-mutation $N'$ of
$N$ satisfies $f(x(\infty)) = 1$ is at least $\delta$. Note that this definition only makes sense if the network $N'$ reaches a
fixpoint starting from $a$; see Section 2.2.2 for biological motivation. We say that the network $N$ is optimally
$\epsilon$-robust with respect to $f$ if it achieves the largest survival probability $\delta$ among all networks that are
$(\epsilon, \delta)$-robust with respect to $f$.
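
For small n, the survival probability in this definition can be computed exactly by enumerating all $2^n$ flip patterns and weighting each by its probability. The sketch below is ours; the names `fix` and `survival_probability` are hypothetical, and we make the explicit assumption that a mutated network that never reaches a fixpoint counts as not viable.

```python
from itertools import product

def fix(updates, a):
    """Fixpoint reached from configuration a, or None if a longer cycle is entered."""
    x, seen = tuple(a), set()
    while x not in seen:
        seen.add(x)
        nxt = tuple(u(x) for u in updates)
        if nxt == x:
            return x
        x = nxt
    return None

def survival_probability(updates, a, f, eps):
    """Exact probability that a random mutation (rate eps) reaches a viable fixpoint.

    Assumption: mutations whose dynamics do not reach a fixpoint count as failures.
    """
    n = len(updates)
    total = 0.0
    for flips in product((False, True), repeat=n):
        mutated = [(lambda x, u=u: -u(x)) if flip else u
                   for u, flip in zip(updates, flips)]
        k = sum(flips)                          # number of complemented functions
        weight = eps ** k * (1 - eps) ** (n - k)
        x_inf = fix(mutated, a)
        if x_inf is not None and f(x_inf) == 1:
            total += weight
    return total

# Hypothetical example: a static all-ones network on 3 nodes and the majority
# objective, which tolerates at most one flipped node.
static = [lambda x: 1] * 3
majority = lambda x: 1 if sum(x) > 0 else -1
p_static = survival_probability(static, (1, 1, 1), majority, 0.1)
```

In this example the network is (0.1, δ)-robust with respect to majority for every δ up to $0.9^3 + 3 \cdot 0.1 \cdot 0.9^2 = 0.972$.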

The central question we address in this paper is whether, given an objective function $f$, there exists a network
$N$ that is $(\epsilon, \delta)$-robust with respect to $f$ with $\epsilon$ a constant and $\delta$ very close to 1. In order to formalize "very
close", we parametrize the objective function by n, the number of boolean characters, and then require that
$\delta$ goes to 1 asymptotically as n approaches infinity (holding $\epsilon$ constant). For a mutation rate $\epsilon \in (0, \frac{1}{2})$, a
family of objective functions $\{f_n : \{\pm 1\}^n \to \{\pm 1\}\}_{n = 1, 2, \dots}$ is said to be $\epsilon$-robustly expressible iff, for every
n, there exists a boolean network $N_n$ on n variables and a configuration $a^{(n)} \in \{\pm 1\}^n$ such that $N_n$ with
initial configuration $a^{(n)}$ is $(\epsilon, \delta_n)$-robust with respect to $f_n$, where the survival probabilities $\delta_n$ go to 1 as
n approaches infinity. A family of objective functions is said to be robustly expressible if it is $\epsilon$-robustly
expressible for some constant $\epsilon \in (0, \frac{1}{2})$. We will sometimes abuse notation by saying that an objective
function (instead of a family of objective functions) is robustly expressible; here, the function is implicitly
parametrized by its arity.
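
As a simple worked example of this definition (ours, not one of the results of the paper), consider the majority function on an odd number n of variables together with the static network whose update functions are all the constant 1. A random mutation with parameter ε flips each coordinate of the fixpoint independently with probability ε, so the network survives exactly when fewer than n/2 flips occur. For any constant ε below 1/2 this probability tends to 1, so majority is robustly expressible in the above sense. The function name `majority_survival` is hypothetical.

```python
from math import comb

def majority_survival(n, eps):
    """Survival probability of the all-ones static network with respect to
    majority on n variables (n odd): P[Binomial(n, eps) < n / 2]."""
    return sum(comb(n, k) * eps ** k * (1 - eps) ** (n - k)
               for k in range((n + 1) // 2))

# Survival probabilities for growing n at a fixed mutation parameter.
survival_by_n = {n: majority_survival(n, 0.25) for n in (11, 101, 501)}
```

The values in `survival_by_n` increase with n, matching the requirement that the survival probabilities go to 1 while the mutation parameter stays constant.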

2.2 The Greater Picture 

Let us reconnect the formal definitions of Section 2.1 to the biological motivation discussed in the
introduction. The genotype is described by a boolean network, which mimics the dynamics of the genetic
regulatory network. Boolean networks are a well-established model for describing regulatory network
interactions (see, for instance, the study of floral organ development using such a model in [ESPLAB04]). These
network models are often devised by discretizing nonlinear continuous models but, for our purposes, this is
inessential. For a given genotype represented as a boolean network, the phenotype is assumed to be the gene
expression levels, given by the stable configuration of the network. As mentioned earlier, the correspondence
between the gene expression levels and the stable configuration of the network has been shown to be valid in
several biological experiments (see e.g. [AO03, MAB00, ESPLAB04, HI00, HEBYI05, GdBLC03]).
Nevertheless, one should note that modelling the phenotype by the gene expression levels is a gross simplification
of reality. For example, the environment and other epigenetic factors also often play an important role. But
as a first attempt at a systematic understanding of the robustness of regulatory networks, our model should
suffice.

The particular phenotypic property with respect to which robustness is ascertained is captured in the 
boolean objective function. Note that the objective function could be highly complicated. For example, if 
the objective function determines whether a piece of RNA forms a particular secondary structure, it would 
have to encode a procedure for determining the secondary structure from a given RNA primary structure (a 
problem for which no efficient algorithms are known in general). As environmental and background genetic 
conditions can be very complex, it is difficult to say anything specific about the structure of the objective 
functions that arise in nature. 

In our analysis of robustness, we hold the mutation rate constant and let the number of characters in
the phenotype grow arbitrarily large, since in reality the number of genes is very large and, in
many models, the mutation rate is independent of the number of genes. We discuss some further modelling
issues below.



2.2.1 Degree of Regulatory Networks. 

In nature, regulatory interactions between genes are implemented through the presence of regulatory regions in
the genome. Regulatory proteins, such as promoters and inhibitors, bind to such a region and either encourage
or discourage expression of the associated gene. The regulatory region is a short stretch of DNA and so it
is not feasible to have too many regulatory proteins binding to a single regulatory region simultaneously. In
our model, this means that the in-degree (the number of incoming edges) of nodes in the boolean
network representing the genotype should be small. For a boolean network $N$, we call the maximum
in-degree of a node in $G_N$ the degree of the network $N$. When the degree is 0, the network essentially
consists of a static configuration, which does not seem too interesting from a biological perspective. And
when the degree is n, a node in the network can potentially interact with every other node in the network, an
impractical situation. We will therefore be interested in the tradeoff between degree and robustness of boolean
networks.
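
Given the edge set of the induced graph, the degree of a network is a one-pass computation. The sketch below assumes, for illustration only, that edges are stored as (source, target) pairs over 0-based node indices; the name `network_degree` is hypothetical.

```python
def network_degree(edges, n):
    """Maximum in-degree over the n nodes of a directed graph given as a
    collection of (source, target) pairs."""
    indeg = [0] * n
    for _, target in edges:
        indeg[target] += 1
    return max(indeg, default=0)
```

A static assignment has an empty edge set and hence degree 0, while a node regulated by all n variables gives degree n.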

2.2.2 Attractors of Boolean Networks. 

As noted above, we associate phenotypes with fixpoints of boolean networks. But clearly, boolean networks
can also reach a cycle instead of a fixpoint as an attractor. In this paper, we do not consider such networks
to robustly express any objective function. The primary reason for this restriction is that in most modelling
of real biological systems by boolean networks, the attractors reached are found to be fixpoints rather than
cycles [ESPLAB04]. There are instances in the literature, such as [ABKR07], where the phenotypes are
taken to be attractors, whether they be fixpoints or cycles. But it is not clear whether such a setup is indeed
reasonable from the perspective of modelling biological systems.

2.2.3 Mutation Models. 

The literature shows many approaches for modelling how mutations act on the genotype, or the boolean
network in our model. Mutations could modify the network by changing the adjacency relations (as in [BB08]),
changing the regulatory interactions (as in [SD07]), or duplicating and deleting nodes (as in [ABKR07]).
In this paper we investigate the case in which a mutation on the genotype modifies the regulatory
interactions between nodes by changing selected update functions to their complements. We believe that
such errors capture the most common types of mutations in nature. These mutations to the boolean
network can either be genetic changes passed down from one generation to the next, or be due to environmental
disturbances occurring in a single individual.

It is important to notice that in our setting, some of the other choices of mutation models do not give
interesting results. For instance, if errors are only on edges, then a robust network would be a static
assignment, i.e., a network with no edges, which is not so interesting. Also, if mutations change the update
functions arbitrarily, then the graph induced by the network changes arbitrarily, which would make
robustness with respect to non-symmetric objective functions impossible to achieve. Additionally, it is not very
biologically plausible for a gene to change its entire set of regulators after one mutation. Thus, for
a mutation model to be mathematically interesting and biologically relevant, we require that the network
robustness should not decrease if the network degree is increased and that, if $N'$ is a mutation of $N$, $G_{N'}$
should be very closely related to $G_N$. In this paper, our mutation model obeys $G_N = G_{N'}$.

For other mutational models which do give interesting results, such as errors on both edges and nodes, 
we believe that many of the results presented here apply in spirit to those models as well and, also, that 
many of the techniques we describe for constructing robust networks have analogues in the alternate models. 

2.2.4 Constructivity of Robust Networks 

Robust expressibility of an objective function just asserts that there exists a network robustly expressing it. 
However, for nature to be able to use the robustness property, it needs to be able to construct the network in 
some fashion. Perhaps it starts with a non-robust network and makes small changes to it to drive it toward 
robustness, in an evolutionary process, or perhaps it uses some other procedure, but a basic requirement for 
the robust expressibility property to be useful is that the robust network be efficiently constructible. More 
precisely, for an objective function $f : \{\pm 1\}^n \to \{\pm 1\}$, we say that a network $N$ robust with respect to $f$ is
efficiently constructible if there is a polynomial time algorithm that has access to $f$ as an oracle and that
outputs a description of N. For an arbitrary objective function, one cannot hope for efficient constructibility. 
However, one would like to show efficient constructibility for objective functions belonging to special families 
which have small description size. Also, note that efficient constructibility is a weak condition to impose, 
since nature might be restricted to a weak computational model. 

2.3 Previous Work 

Evolvability and the origin of novelty in biological systems are intensively studied topics in
biology. The Plausibility of Life ([KG05]) by Kirschner and Gerhart is a great introduction to the current
understanding in this area, containing pointers to many relevant articles in the field. The connection between
robustness and evolvability is described well in [Wag05] and in [CMW07].

Boolean networks were originally introduced by Kauffman in [Kau69]. In The Origins of Order ([Kau93]),
Kauffman explains his position that biological systems display the properties of an ensemble of random
boolean networks with parameters that make them lie at the 'edge of chaos', that is, near a statistical phase
transition. The concept of designing boolean networks to solve specific problems is a newer idea, discussed
by Hasty et al. in [HMC02], for example. Genetic regulatory networks have been modelled by boolean
networks in many parts of the literature ([AO03, MAB00, ESPLAB04, HI00, HEBYI05, GdBLC03]).

Robustness of boolean networks against various types of faults has also been under investigation.
[ABKR07, SR08] study the resilience of random boolean networks to mutations in the update functions,
while [BB08] and [SD07] study the resilience of single boolean networks to other types of faults. These
papers offer numerical/experimental evidence in favor of robustness and do not study how robustness arises
formally. In a non-biological context, robustness was also studied by Hornby in [Hor04] in the context of
developing programs expressing the design of furniture. He was interested in representations of tables that
one could mutate and still retain a table-generating program (but with other potentially useful features).
Finally, evolvability of evolutionary programs was studied by Reisinger and Miikkulainen in [RM07], but in
an empirical fashion.

As far as we know, this is the first rigorous investigation of robustness and evolvability. As mentioned
above, there has been a stream of previous papers examining theoretical models for the regulatory system 
and then showing through simulation how robustness can arise. What we believe is novel in our current work 
is that we obtain a mathematical understanding of how the level of robustness is related to the structure of 
the regulatory network model. There have been previous papers which have suggested the need for a formal 
study, such as [WA96] and [VHW+03]. Furthermore, the philosophical underpinnings of our study are 
slightly different from most earlier work. Most of the previous papers model the genotype to be uniformly 
generated by a process that is described by a few parameters. For instance, in the standard Kauffman 
model, random boolean networks (NK networks) are uniformly generated by a probabilistic process that
takes as inputs a parameter N for the number of nodes and a parameter K for the degree of each node. 
On the other hand, the genotype model in our work is far richer. We allow the genotype structure to be 
as complicated as desired and then optimize for robustness. The difference between the two approaches is 
characteristic of the difference between the methodologies of physics and computer science. Carlson and 
Doyle [CD99, CD00, DC00] have introduced a conceptual framework, known as Highly Optimized Tolerance,
that argues that biological systems are highly structured and optimized for robustness and that they must 
be described by a large number of parameters; our work can be loosely viewed as fitting into this framework. 

2.4 Our Results 

We start our investigation of robustness by focusing on the special class of acyclic boolean networks. These 
networks have the feature that they are guaranteed to reach fixpoints starting from any initial configuration. 
They are also mathematically easier to handle than general boolean networks. Although structurally quite 
simple, we show in Section 3.2 that they already are, very often, more robust than simple static assignments. 
In Section 3.2.1, we start by formally defining acyclic boolean networks and characterize them explicitly 
algebraically. Then, in Section 3.2.2, we discuss a connection between acyclic networks robust with respect 
to an objective function and decision trees for that function. Using this relationship, we give a procedure 
for constructing the optimally robust acyclic boolean network with respect to an objective function in time 
quasipolynomial in the truth-table size of the function^2. In Section 3.2.3, the algebraic characterization
from Section 3.2.1 is used to show that if a low-degree acyclic network is robust with respect to an objective 
function, then the configurations expressed by the network lie with high probability in an efficiently learnable 
subset of the phenotype domain. This is interesting because it suggests heuristically that the need to achieve 
robustness necessarily constrains nature to generating phenotypes that lie in an "easily describable" set. 
This contrasts with the usual belief that biological structures are somehow very complex and inexplicably 
varied. In Appendix A, we show that the algebraic characterization can be used to describe a large class 
of function families robustly expressible by constant degree, acyclic boolean networks. These functions are
described as polynomial threshold functions, which are well-studied objects in computer science (see [Sak93]
for a survey).

^2 In fact, as discussed in Section 2.2.4, efficient constructibility requires that the optimally robust network be found in time
polynomial in n, not in the truth-table size $2^n$. Using the connection with decision trees, we show efficient constructibility for
symmetric functions, which can be represented concisely.

In Section 3.2.4, we show separation between various classes of acyclic networks in terms of their ability 
to robustly express objective functions. First, we show explicit objective functions which can be robustly 
expressed by (even low degree) acyclic networks but are far from being robustly expressible by a static 
assignment. Then, we prove that acyclic boolean networks of sublinear degree can robustly express only a
tiny fraction of functions of constant density. On the other hand, a random function of constant density is 
expected to be robustly expressed by an acyclic network with no degree restriction. Hence, since we know 
that in biology, many regulatory networks are of low degree, this suggests that either the objective functions 
are chosen from the "low-complexity" set of functions we characterized earlier or that the networks in biology 
are cyclic. 

In Section 3.3, we investigate cyclic networks (networks that potentially possess feedback loops). We are 
interested in networks that always arrive at fixpoints, even in the presence of mutations, but when started 
from specific initial configurations. We show that cyclic networks can constrain the fixpoint configurations 
much more than acyclic networks. A little more precisely, one implication of our results in Section 3.3 is 
that cyclic networks can force the fixpoint configurations to lie in a set of 2'^''1°k"\ in contrast to acyclic 
networks which can only robustly express functions that are satisfied by at least 2^^'"^ configurations (see 
Section 3.2.3). This gap separates cyclic from acyclic networks. We also show that any objective function 
which can be satisfied by fixing the values of a few variables is also robustly expressible by cyclic networks. 
Conceptually, this means that if the genome needs some of its genes to have some fixed expression levels, 
then it can expend only a logarithmic number of regulatory genes in order to make them fixed with high 
probability. We also obtain stronger results which show, for instance, that robust expressibility is guaranteed 
whenever we have a set of variables of size O( iog„ ) that can compensate for any mutations in the remaining 
variables. These examples stand in strong contrast to the negative results for acyclic networks and suggest 
the combinatorial power of cyclic networks. 



3 Results and Proofs 

3.1 Mathematical Preliminaries 

The mathematical notation used in this paper is fairly standard. For integer n > 1, we use [n] to denote 
the set {1, . . . ,n}. Below, we explain the Landau notation, and then we describe two results on random 
variables that are used frequently in the following sections. 



3.1.1 Landau Notation 

Big O notation (i.e., $O$, $o$, $\Omega$, $\omega$), or Landau notation, describes the limiting behavior of a function when
the argument tends towards a particular value or infinity. Throughout this paper it is used to describe the
asymptotic behavior of quantities depending on the number of network nodes n, with n increasing towards
infinity. In the rest of the paper we omit the specification $n \to \infty$. The formal definitions are as follows:



$f \preceq g \iff f(n) = O(g(n)) \iff \limsup_{n \to \infty} \left| \frac{f(n)}{g(n)} \right| < \infty$

$f \prec g \iff f(n) = o(g(n)) \iff \limsup_{n \to \infty} \left| \frac{f(n)}{g(n)} \right| = 0$

$f \succeq g \iff f(n) = \Omega(g(n)) \iff \limsup_{n \to \infty} \left| \frac{g(n)}{f(n)} \right| < \infty$

$f \succ g \iff f(n) = \omega(g(n)) \iff \limsup_{n \to \infty} \left| \frac{g(n)}{f(n)} \right| = 0$

Intuitively, e.g., $f(n) = O(g(n))$ means that the quantity $f$ is (asymptotically) not bigger than ($\preceq$) 
$g$ for large networks (large $n$). All these comparisons do not take constant factors into account. 



3.1.2 Union bound 



The union bound is the simplest way to bound the probability of an event $X$ which cannot occur without 
at least one of the events $X_1, X_2, \ldots$ taking place, i.e. $X \subseteq X_1 \cup X_2 \cup \ldots$. Intuitively, the probability 
that event $X$ occurs is bounded from above by the sum of the probabilities of the other events; precisely: 

$P\left(\bigcup_i X_i\right) \le \sum_i P(X_i)$

3.1.3 Chernoff bounds 

The Chernoff bound is a concentration bound for the sum of independent random variables. It states that the 
probability that the sum of outcomes of independent random experiments deviates from its expectation by a 
factor $\epsilon$ decreases exponentially in $\epsilon$. There are multiple forms of this bound. The ones used 
throughout this paper are: 

Theorem 3.1 Let $X = \sum_{i \in [n]} X_i$ where the $X_i$, for each $i \in [n]$, are independently distributed in $[0,1]$. Then, 
for $\epsilon \in (0,1)$: 

$P(X > (1 + \epsilon) E[X]) < \exp\left(-\frac{\epsilon^2}{3} E[X]\right)$

$P(X < (1 - \epsilon) E[X]) < \exp\left(-\frac{\epsilon^2}{2} E[X]\right)$
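
As a quick sanity check of Theorem 3.1, the following sketch compares the exact tails of a binomial sum with the two bounds. It assumes the standard constants 3 and 2 in the exponents; the function names and the parameters `n`, `p` and `eps` are illustrative choices, not part of the paper.

```python
from math import comb, exp

def binomial_tails(n, p, eps):
    """Exact tails of X ~ Binomial(n, p) around its mean E[X] = n * p."""
    mean = n * p
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    upper = sum(q for k, q in enumerate(pmf) if k > (1 + eps) * mean)
    lower = sum(q for k, q in enumerate(pmf) if k < (1 - eps) * mean)
    return upper, lower

def chernoff_bounds(n, p, eps):
    """Right-hand sides of Theorem 3.1 (assumed constants 3 and 2)."""
    mean = n * p
    return exp(-eps**2 * mean / 3), exp(-eps**2 * mean / 2)

if __name__ == "__main__":
    for n in (50, 200, 800):
        tails = binomial_tails(n, 0.3, 0.2)
        bounds = chernoff_bounds(n, 0.3, 0.2)
        print(n, tails, bounds)
```

Both the exact tails and the bounds shrink as `n` grows, which is the behavior the later proofs rely on.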



3.2 Acyclic Networks 

We start our investigation of the robustness property by restricting attention to a subclass of boolean 
networks. Naturally, because the model is weaker, we can obtain stronger results here than we can in the 
more general setting of unrestricted boolean networks to which we return later. 



3.2.1 Definition and Characterizing Properties of Acyclic Networks 

We are interested in networks that reach fixpoints starting from an initial configuration and after arbitrary 
mutations. To simplify our task, let us restrict ourselves to networks that reach the same fixpoint regardless 
of the initial configuration and such that fixpoints reached by different mutations are different. More formally, 
we say that a boolean network N is feed-forward if: 

(i) for any mutation $N'$ of $N$ and for any initial configurations $\alpha_1, \alpha_2 \in \{\pm 1\}^n$, 
$\mathrm{Fix}(N', \alpha_1) = \mathrm{Fix}(N', \alpha_2) = \mathrm{Fix}(N')$ 

(ii) for any two non-identical mutations $N'$ and $N''$ of $N$, $\mathrm{Fix}(N') \ne \mathrm{Fix}(N'')$ 

Thus, in feed-forward networks, the initial configuration is irrelevant in determining the fixpoint, and there 
is a bijection between mutations of a network and their fixpoints. 

Our main object of study in this section is a special type of feed-forward network. We say a boolean 
network $N$ is acyclic if any network $N'$ with $G_{N'} = G_N$ is feed-forward. So, in particular, any mutation of 
an acyclic network is also acyclic. The name arises from the following simple claim: 

Theorem 3.2 A boolean network $N$ is acyclic iff $G_N$ is a directed acyclic graph. 

Proof: If $G_N$ is a directed acyclic graph, then $N$ is feed-forward because one can update the nodes of $N$ 
in the sequence determined by a topological order on $G_N$. It is clear then that the stable configuration reached 
is a fixpoint, independent of the initial configuration. To see that two non-identical mutations reach different 
fixpoints, consider the first node in the topological order that is mutated in one network but not in the other, 
and observe that its states in the fixpoint configurations of the two networks will be different. 
If $G_N$ contains a directed cycle, one can choose update functions for the nodes that make the induced network 
have a cycle attractor. This can be done simply by making sure that there is no assignment to the nodes 
that satisfies all the update functions on the cycle simultaneously. ∎ 
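
The two feed-forward properties can be checked by brute force on a small example. The sketch below assumes synchronous updates and models a mutation as negating the update function of a node (the reading suggested by the proofs in this section); the four update functions in `UPDATES` are hypothetical and only chosen so that node `i` reads nodes `j < i`.

```python
from itertools import product

# Hypothetical 4-node acyclic network; node i only reads nodes j < i.
UPDATES = [
    lambda x: 1,
    lambda x: x[0],
    lambda x: -x[0] * x[1],
    lambda x: 1 if (x[1] == 1 and x[2] == 1) else -1,
]

def step(state, mutated):
    """One synchronous update; a mutated node negates its update function."""
    return tuple(-u(state) if i in mutated else u(state)
                 for i, u in enumerate(UPDATES))

def fixpoint(initial, mutated):
    """Iterate until the state stops changing (at most n steps in a DAG)."""
    state = tuple(initial)
    for _ in range(len(UPDATES) + 1):
        nxt = step(state, mutated)
        if nxt == state:
            return state
        state = nxt
    raise RuntimeError("no fixpoint reached")

def check_feed_forward():
    """Property (i): fixpoint ignores the start; (ii): mutations are distinguishable."""
    n = len(UPDATES)
    seen = set()
    for bits in product((False, True), repeat=n):
        mutated = {i for i, m in enumerate(bits) if m}
        fixpoints = {fixpoint(init, mutated) for init in product((1, -1), repeat=n)}
        if len(fixpoints) != 1:
            return False
        seen |= fixpoints
    return len(seen) == 2 ** n

if __name__ == "__main__":
    print(check_feed_forward())
```

Because every one of the 2^n mutation patterns yields its own fixpoint, the map from mutations to fixpoints is a bijection on this example.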

Next, we show that by viewing the effect of mutations more algebraically, we can precisely describe the 
structure of objective functions which can be robustly expressed by acyclic networks. In doing so, we get 
a better understanding of how the degree of the network limits the class of functions robustly expressible 
and how fixpoints of the mutations of acyclic networks are distributed. First, some notation. Define the 
$\epsilon$-biased product measure $\mu_\epsilon$ on $\{\pm 1\}^n$ by $\mu_\epsilon(x_1, \ldots, x_n) = \epsilon^{n-k}(1 - \epsilon)^k$ where $k = |\{i : x_i = 1\}|$. We may 
view a function $f : \{\pm 1\}^n \to \{\pm 1\}$ as the characteristic function of a subset of $\{\pm 1\}^n$, that is, the subset 
$\{x \in \{\pm 1\}^n : f(x) = 1\}$. Then, $\mu_\epsilon(f)$ denotes the weight assigned by the measure $\mu_\epsilon$ to the set characterized 
by $f$. Our main observation is the following. 

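Lemma 3.3 $f : \{\pm 1\}^n \to \{\pm 1\}$ is $\epsilon$-robustly expressible by an acyclic boolean network of degree $d$ if and 
only if there exist $\pi, g, \varphi_1, \ldots, \varphi_n$ such that: 

$f(x_1, \ldots, x_n) = g(x_{\pi(1)} \cdot \varphi_1(), x_{\pi(2)} \cdot \varphi_2(x_{\pi(1)}), \ldots, x_{\pi(n)} \cdot \varphi_n(x_{\pi(1)}, \ldots, x_{\pi(n-1)}))$ (1) 

where: 

(i) $\pi : [n] \to [n]$ is a permutation, 

(ii) for $i \in [n]$, $\varphi_i : \{\pm 1\}^{i-1} \to \{\pm 1\}$ is a boolean function depending on at most $d$ inputs, and 
(iii) $g : \{\pm 1\}^n \to \{\pm 1\}$ is such that $\mu_\epsilon(g) \ge 1 - o(1)$. 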

Proof: To prove one direction, suppose $f$ is $\epsilon$-robustly expressed by an acyclic network $N$ of degree $d$. 
Since $G_N$ can be topologically ordered, there exists a permutation $\pi : [n] \to [n]$ such that there is an edge 
from node $i$ to node $j$ in $G_N$ only if $\pi^{-1}(i) < \pi^{-1}(j)$. For every $i \in [n]$, let $\varphi_i$ denote the update 
function associated with node $\pi(i)$ in the network. Note that for any $i$, the function $\varphi_i$ can only take as 
arguments at most $d$ elements of the set $\{x_{\pi(j)}\}_{j<i}$. Let $g(s_1, \ldots, s_n) = f(x_1, \ldots, x_n)$ where inductively, 
$x_{\pi(i)} = s_{\pi(i)} \varphi_i(x_{\pi(1)}, \ldots, x_{\pi(i-1)})$ for each $i \in [n]$. One can explicitly verify now that Equation (1) holds 
for this choice of $g$: if $s_{\pi(i)} = x_{\pi(i)} \cdot \varphi_i(x_{\pi(1)}, x_{\pi(2)}, \ldots, x_{\pi(i-1)})$, then by our choice of $g$, 
$g(s_1, \ldots, s_n) = f(x_1, \ldots, x_n)$. Now, in an $\epsilon$-mutation of $N$, each $x_{\pi(i)} \cdot \varphi_i(x_{\pi(1)}, x_{\pi(2)}, \ldots, x_{\pi(i-1)})$ 
is independently $1$ with probability $1 - \epsilon$ and $-1$ with probability $\epsilon$. By definition of $\epsilon$-robust 
expressibility, $g : \{\pm 1\}^n \to \{\pm 1\}$ is such that $\mu_\epsilon(g) \ge 1 - o(1)$. 

The proof in the other direction is similar. Given the permutation $\pi$ and the functions $\varphi_1, \ldots, \varphi_n$, simply 
define a boolean network $N$ where $\pi$ gives the ordering of the nodes and the $\varphi_i$'s specify the update functions 
of the nodes. Then, the condition on $g$ ensures that $f$ is robustly expressed by the network. ∎ 
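
The change of variables used in the proof of Lemma 3.3 can be made concrete on a toy network. This is a minimal sketch under stated assumptions: the permutation is the identity, the three update functions in `PHI` are hypothetical, and `mu` implements the $\epsilon$-biased measure as reconstructed above.

```python
from itertools import product

# Hypothetical update functions phi_i; phi_i may only read x_0 .. x_{i-1}.
PHI = [lambda x: 1, lambda x: x[0], lambda x: -x[0] * x[1]]
N = len(PHI)

def x_from_s(s):
    """Rebuild the configuration x from the mutation pattern s (s_i = -1: mutated)."""
    x = []
    for s_i, phi in zip(s, PHI):
        x.append(s_i * phi(x))
    return tuple(x)

def s_from_x(x):
    """Inverse map: s_i = x_i * phi_i(x_0, ..., x_{i-1})."""
    return tuple(x[i] * PHI[i](x[:i]) for i in range(N))

def mu(eps, g):
    """Weight of {s : g(s) = 1} under the eps-biased product measure."""
    total = 0.0
    for s in product((1, -1), repeat=N):
        k = s.count(1)
        if g(s) == 1:
            total += eps ** (N - k) * (1 - eps) ** k
    return total

def f(x):
    """Hypothetical objective: viable iff the last node is -1."""
    return 1 if x[2] == -1 else -1

if __name__ == "__main__":
    g = lambda s: f(x_from_s(s))
    print(mu(0.1, g))  # survival probability of the toy network
```

The survival probability of the network equals $\mu_\epsilon(g)$ because each coordinate of `s` is independently $-1$ with probability $\epsilon$.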
Figure 2: For the objective function $f(x_1, x_2, x_3) = (x_1 \wedge x_2) \vee (x_2 \wedge x_3)$, the decision tree is shown on the left. 
Each node $v$ in the tree is associated with a value $b(v) \in \{\pm 1\}$, as described in the text, and the adjacent 
outgoing edge from $v$ labeled $b(v)$ is in bold. The corresponding boolean network is shown on the right. 



3.2.2 Optimal Networks from Decision Trees 

Furthermore, as we show next, we can construct the optimally robust acyclic network for a given objective 
function in time quasipolynomial in the truth-table size of the function. For this, let us recall the notion of 
a decision tree for a boolean function $f : \{\pm 1\}^n \to \{\pm 1\}$. It is a rooted binary tree $T_f$ where each edge is 
labeled $-1$ or $1$, each non-leaf vertex is labeled with a variable, and each leaf vertex is labeled with a $-1$ or 
$1$. The decision tree $T_f$ computes $f$ in the natural way: any assignment to the variables determines a unique 
path from the root to a leaf, and the label of the leaf at the end of the path is the value of the function 
applied to the assignment. 

Now, with any decision tree $T_f$ for an objective function $f$, we associate a boolean network $N_{T_f}$. Given 
a mutation parameter $\epsilon \in (0, \frac{1}{2})$, the decision tree is first preprocessed as follows. With each non-leaf node $v$ 
of the decision tree, we associate a real number $s(v) \in [0, 1]$ and a bit $b(v) \in \{\pm 1\}$. For a leaf node $v$, $s(v)$ is 
defined to be equal to $1$ if the leaf label is $1$ and $0$ otherwise. For a non-leaf node $v$ with its two child nodes $v_1$ 
and $v_2$, $b(v)$ is equal to the label of the edge leading to the child $\arg\max_{w \in \{v_1, v_2\}} s(w)$, and $s(v)$ is equal to 
$(1 - \epsilon) \max(s(v_1), s(v_2)) + \epsilon \min(s(v_1), s(v_2))$. That is, we define $s(v)$ and $b(v)$ iteratively from the leaf nodes 
up to the root. Now, the boolean network $N_{T_f} = (x, u_1, \ldots, u_n)$ is defined by setting each update function 
$u_i$ to output the value $b(v)$, where $v$ is the first node in $T_f$ labeled $x_i$ obtained by following edges of the 
decision tree down from the root. (If $x_i$ is not reached, then $u_i$ can be arbitrary.) The degree of the network 
constructed thus is at most $n$. It is clear that if $T_f$ is a layered decision tree, i.e. nodes at the same distance 
from the root have the same label, then the network is acyclic. Figure 2 illustrates the construction of 
the network $N_{T_f}$ from the decision tree $T_f$ for the objective function $f(x_1, x_2, x_3) = (x_1 \wedge x_2) \vee (x_2 \wedge x_3)$. 
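
The preprocessing of a layered decision tree can be sketched as a short recursion. The code below is an illustration rather than the authors' implementation: it assumes that $+1$ encodes "true", breaks ties in the argmax towards $+1$, and uses 0-based variable indices; the names `preprocess`, `f_example` and `best_order` are hypothetical.

```python
from itertools import permutations

def f_example(x):
    """Objective of Figure 2: (x1 AND x2) OR (x2 AND x3), with +1 read as true."""
    return 1 if (x[0] == 1 and x[1] == 1) or (x[1] == 1 and x[2] == 1) else -1

def preprocess(f, order, eps, assignment=None):
    """Return (s(root), bold) for the layered tree reading variables in `order`.

    `bold` maps each variable to the bit b(v) chosen along the bold path,
    i.e. the configuration expressed by the unmutated network.
    """
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(order):  # leaf: s(v) is 1 if f is satisfied, else 0
        x = tuple(assignment[i] for i in range(len(order)))
        return (1.0 if f(x) == 1 else 0.0), {}
    var = order[len(assignment)]
    child = {bit: preprocess(f, order, eps, {**assignment, var: bit})
             for bit in (1, -1)}
    b = 1 if child[1][0] >= child[-1][0] else -1  # edge towards the larger s
    s = (1 - eps) * child[b][0] + eps * child[-b][0]
    return s, {var: b, **child[b][1]}

def best_order(f, n, eps):
    """Brute force over all n! variable orders, as in the running-time estimate."""
    return max(permutations(range(n)), key=lambda o: preprocess(f, o, eps)[0])

if __name__ == "__main__":
    s, bold = preprocess(f_example, (1, 0, 2), 0.1)
    print(s, bold, best_order(f_example, 3, 0.1))
```

The value `s` at the root is the survival probability of the resulting network for the chosen variable order, and `best_order` shows where the factor $n!$ in the running time comes from.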

Theorem 3.4 For a given objective function $f : \{\pm 1\}^n \to \{\pm 1\}$ and $\epsilon \in (0, \frac{1}{2})$, the acyclic boolean network 
that is optimally $\epsilon$-robust with respect to $f$ is a network $N_{T_f}$ for some layered decision tree $T_f$ for $f$. 

Proof: Fix a total order among the variables $x_1, \ldots, x_n$. We show that if $T_f$ is a layered decision tree for 
$f$ reading the variables in the selected order, then $N_{T_f}$ is the optimally $\epsilon$-robust network with respect to $f$ 
among those acyclic networks $N$ for which the DAG $G_N$ is consistent with the selected ordering. To see this, 
use induction on $n$. If $n = 1$, simply outputting the bit which satisfies $f$ is optimally robust. For $n > 1$, 
consider the first node in the total order. In the optimally $\epsilon$-robust network, it must be that after the first 
node has set its state, the remaining network on $n - 1$ nodes is also optimally $\epsilon$-robust with respect 
to the function on $n - 1$ bits induced after setting the first node. So, the first node must set its state to the 
bit such that the network on $n - 1$ bits with the higher survival probability is chosen. But this is exactly 
how the network $N_{T_f}$ is constructed. ∎ 

Therefore, the time needed to construct the optimally robust acyclic network with respect to a given 
objective function $f$ is at most $n! \cdot O(2^n) \le 2^{O(n \log n)}$, which is quasipolynomial in $2^n$, the description length 
of an arbitrary boolean function. For an objective function that is guaranteed to have small description 
length, one could hope for a much faster algorithm. Our only result in this direction is the following for 
symmetric functions; a function $f : \{\pm 1\}^n \to \{\pm 1\}$ is said to be symmetric if for any permutation $\pi$ on $[n]$, 

$f(x_1, \ldots, x_n) = f(x_{\pi(1)}, \ldots, x_{\pi(n)})$. 

Theorem 3.5 The optimally robust acyclic network for a symmetric function can be constructed in 0{ji^) 
time. 

The key idea for the proof is to specialize the decision tree algorithm for symmetric functions so as to reduce 
the number of queries. 

Notice that our construction of the network $N_{T_f}$ from the decision tree $T_f$ did not depend on the fact 
that $T_f$ was layered. We term the networks arising from decision trees pseudo-acyclic boolean networks. 
One can easily check that any pseudo-acyclic boolean network is feed-forward. In fact, we conjecture that 
any feed-forward network is also pseudo-acyclic, and so, in some sense, pseudo-acyclic networks lie on the 
border between acyclic networks and general boolean networks for which all mutations reach fixed points 
(but perhaps starting only from certain initial configurations). The optimally robust pseudo-acyclic boolean 
network with respect to a given objective function can be found by enumerating over all decision trees $T_f$ 
of a function and maximizing over the survival probabilities of $N_{T_f}$. The correctness argument is similar to 
that of Theorem 3.4. Most of our results in the following subsections can be translated to the pseudo-acyclic 
setting. 

3.2.3 Robustly Expressible Polynomial Threshold Functions 

In this section, we show connections between robust expressibility by acyclic networks and polynomial 
threshold functions. A function $f : \{\pm 1\}^n \to \{\pm 1\}$ is said to be a polynomial threshold function of degree $d$ iff 
$f$ can be written as $\mathrm{sgn}(p(x_1, \ldots, x_n))$ where $p : \{\pm 1\}^n \to \mathbb{R}$ is a polynomial** with real-valued coefficients 
of degree at most $d$ and where $\mathrm{sgn} : \mathbb{R} \to \{\pm 1\}$ is the sign function which takes any negative input to $-1$ 
and any non-negative input to $+1$. Polynomial threshold functions are well-studied objects in theoretical 
computer science, arising for instance in learning theory and circuit complexity. In particular, low-degree 
polynomial threshold functions have been shown to be easy to compute in several natural computational 
models. 

The characterization in Lemma 3.3 immediately leads to the following implication for function families 
robustly expressible by acyclic networks. 

Theorem 3.6 If $f : \{\pm 1\}^n \to \{\pm 1\}$ is robustly expressible by an acyclic boolean network $N$ of constant 
degree $d$, there is a function $P : \{\pm 1\}^n \to \{\pm 1\}$ such that: 

(i) $P$ is computable by a polynomial threshold function of degree $2d + 2$, 

(ii) at least a $1 - o(1)$ fraction of the solution set of $P$ satisfies $f$, and 

(iii) the probability that an $\epsilon$-mutation of $N$ expresses a configuration satisfying $P$ is at least $1 - o(1)$. 

Proof: Suppose $f$ is $\epsilon$-robustly expressed by an acyclic network of degree $d$. We use Lemma 3.3 to write 
$f$ as $g(x_{\pi(1)} \cdot \varphi_1(), x_{\pi(2)} \cdot \varphi_2(x_{\pi(1)}), \ldots, x_{\pi(n)} \cdot \varphi_n(x_{\pi(1)}, x_{\pi(2)}, \ldots, x_{\pi(n-1)}))$ where each $\varphi_i$ is of constant arity 
$d$ and $g$ is such that $\mu_\epsilon(g) \ge 1 - o(1)$. For $i \in [n]$, set $s_i = x_{\pi(i)} \cdot \varphi_i(x_{\pi(1)}, \ldots, x_{\pi(i-1)})$. As we showed in the 
proof of Lemma 3.3, there is a bijective correspondence between $(x_1, \ldots, x_n)$ and $(s_1, \ldots, s_n)$. 

^Because the inputs are {±1}, we can assume the polynomial to be multilinear without loss of generality. 
Figure 3: Mutations of a constant-degree Boolean network robustly satisfying $f$ express configurations that, 
with high probability, lie in the solution set of $P$. The solution set of $P$ is an efficiently learnable set, almost 
completely contained in the solution set of $f$. 



Let $R = \{(x_1, \ldots, x_n) \in \{\pm 1\}^n : \sum_{i=1}^n s_i \in [(1 - 3\epsilon)n, (1 - \epsilon)n]\}$. We will construct $P$ so that its solution 
set is $R$. In a configuration expressed by an $\epsilon$-mutation of $N$, each $s_i$ is independently $+1$ with probability $1 - \epsilon$ 
and $-1$ with probability $\epsilon$. Therefore, by standard Chernoff bounds, a configuration $(x_1, \ldots, x_n)$ expressed 
by an $\epsilon$-mutation of $N$ satisfies the property that $|\sum_{i=1}^n s_i - (1 - 2\epsilon)n| < \epsilon n$ with probability at least $1 - o(1)$, 
proving part (iii). Moreover, because $\mu_\epsilon(g) \ge 1 - o(1)$, it follows that $\Pr_s[g(s_1, \ldots, s_n) = 1] \ge 1 - o(1)$ where 
$s = (s_1, \ldots, s_n)$ is drawn uniformly from the set $\{(s_1, \ldots, s_n) \in \{\pm 1\}^n : (1 - 3\epsilon)n \le \sum_i s_i \le (1 - \epsilon)n\}$, 
proving part (ii). Finally, for (i), observe that $R$ is the solution set for 

$P(x_1, \ldots, x_n) = \mathrm{sgn}\left(\left(\sum_{i=1}^n s_i - (1 - 3\epsilon)n\right)\left((1 - \epsilon)n - \sum_{i=1}^n s_i\right)\right)$

Expressing each $\varphi_i$ as a multilinear polynomial of degree $d$, each $s_i$ becomes a multilinear polynomial of 
degree $d + 1$. Therefore, $P$ is the sign of a $(2d + 2)$-degree polynomial, proving our theorem. ∎ 
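
To illustrate part (iii), the mass that the $\epsilon$-biased measure places on the band $[(1 - 3\epsilon)n, (1 - \epsilon)n]$ can be computed exactly. The sketch below works directly in the $s$ coordinates and takes $\mathrm{sgn}$ to be $+1$ on non-negative inputs, as defined above; the function names and the sample values of `n` and `eps` are illustrative assumptions.

```python
from math import comb

def in_band(total, n, eps):
    """P = +1 iff the product of the two linear factors is non-negative."""
    return (total - (1 - 3 * eps) * n) * ((1 - eps) * n - total) >= 0

def band_mass(n, eps):
    """Probability that sum_i s_i lands in the band when each s_i is
    independently +1 with probability 1 - eps and -1 with probability eps."""
    mass = 0.0
    for k in range(n + 1):          # k = number of coordinates equal to +1
        total = 2 * k - n           # value of sum_i s_i
        if in_band(total, n, eps):
            mass += comb(n, k) * (1 - eps) ** k * eps ** (n - k)
    return mass

if __name__ == "__main__":
    for n in (100, 400, 1000):
        print(n, band_mass(n, 0.1))
```

The printed mass approaches 1 as `n` grows, in line with the $1 - o(1)$ claim obtained from the Chernoff bound.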

As stated below in Corollary 3.7, Theorem 3.6 implies that a degree bound on an acyclic boolean network 
means that it can only robustly express those functions which contain a "simple" subset. In other words, 
no matter how complicated the objective function / is, a low-degree acyclic network robustly expressing / 
maintains assignments that solve a "simpler" subfunction of $f$. The biological implication is that if there is a 
population of organisms trying to satisfy the same environmental constraint, then the phenotypes displayed 
in the population can be efficiently described. This conclusion goes against the commonly held belief that 
phenotypes occurring in biology are varied in a very complicated fashion. 

To state the corollary precisely, we use the notion of PAC (probably approximately correct) learnability, 
the most commonly used theoretical framework in machine learning. Essentially, a function is said to be 
PAC-learnable if there is an efficient algorithm that can use random example evaluations of the function 
with respect to some probability distribution to learn the function with high probability on a large fraction 
of the domain; see [KV94], for instance, for more details. 

Corollary 3.7 For a function $f : \{\pm 1\}^n \to \{\pm 1\}$ and $\epsilon \in (0, \frac{1}{2})$, if the solution set of $f$ does not contain a 
set $S$ of size at least $2^{H(\epsilon)n}$ that is PAC-learnable in polynomial time with respect to the uniform distribution, 
then there is no acyclic constant-degree boolean network that $\epsilon$-robustly expresses $f$. 

$H(\cdot)$ is the binary entropy function: for $p \in (0, 1)$, $H(p) = -p \log_2 p - (1 - p) \log_2(1 - p)$. 
Proof: Suppose otherwise. Let $R$ be the solution set of the function $P$ guaranteed by Theorem 3.6, and 
let $S = R \cap \{x = (x_1, \ldots, x_n) : f(x) = 1\}$. By the proof of Theorem 3.6, the size of $S$ is at least 
$(1 - o(1)) \sum_{w = (1 - 3\epsilon)n}^{(1 - \epsilon)n} \#\{x \in \{\pm 1\}^n : \sum_i x_i = w\} \ge 2^{(H(1.5\epsilon) - o(1))n}$ by Chernoff bounds. Moreover, $S$ 
can be $(1 - o(1))$-approximated with respect to the uniform distribution by a constant-degree polynomial 
threshold function, which can be PAC-learned from random examples in polynomial time [KS04, KOS04]. ∎ 



3.2.4 Dependence of Robustness on the Network Degree 

In the next two sections, we show results in different directions that illustrate how larger degree networks 
can robustly express more objective functions. 

Advantage of Networks over Static Assignments 

The class of objective functions shown in Appendix A to be robustly expressible provides examples of 
cases in which acyclic networks are strictly stronger than static assignments (degree-0 networks) in the sense 
that they are able to robustly express functions which are out of reach for static assignments. 

Corollary 3.8 For any constant $\epsilon \in (0, 1/2)$, there is a family of functions $f_n : \{\pm 1\}^n \to \{\pm 1\}$ such that 
it is $(\epsilon, 1 - 2^{-\Omega(\sqrt{n})})$-robustly expressible by a degree-2 boolean network but for which there does not exist a 
static assignment with survival probability $2^{-o(\sqrt{n})}$. 

Proof: For each $n$, consider $g_n(x_1, \ldots, x_n) = \mathrm{sgn}(x_1 x_2 + x_1 x_3 + \cdots + x_1 x_n)$. $g_n$ satisfies the conditions of 
Theorem A.3 and hence is robustly expressible by a degree-2 boolean network. On the other hand, for any 
static assignment, $x_1$ could be assigned to the complement of $\mathrm{sgn}(x_2 + \cdots + x_n)$ with constant probability 
$\epsilon$, so that an $\epsilon$-mutation of the assignment satisfies $g_n$ with probability at most $1 - \epsilon$. 

Now let 

$f_n(x_1, \ldots, x_n) = \mathrm{sgn}\left(-(1 - 4\epsilon)\sqrt{n} + g_{\sqrt{n}}(x_1, \ldots, x_{\sqrt{n}}) + \cdots + g_{\sqrt{n}}(x_{n - \sqrt{n} + 1}, \ldots, x_n)\right)$

A Boolean network robustly expressing $f_n$ is simply the disjoint union of the $\sqrt{n}$ Boolean networks expressing 
each of the $g_{\sqrt{n}}$'s. The probability that an $\epsilon$-mutation of this network satisfies $f_n$ is $1 - 2^{-\Omega(\sqrt{n})}$, by the 
Chernoff bound. For a static assignment, we argued above that it can express $g_{\sqrt{n}}$ with 
probability at most $1 - \epsilon$; hence, the expected value of $g_{\sqrt{n}}$ with respect to $\epsilon$-mutations is at most $1 - 2\epsilon$. Again, by 
the Chernoff bound, the survival probability of a static assignment for $f_n$ is then at most $2^{-\Omega(\sqrt{n})}$. ∎ 

In the case where we don't care about the degree we can even show exponentially small survival probability 
for a static assignment on some functions which are robustly expressible by acyclic networks. (See Appendix 
B.) 

Most Functions need Full Degree Networks 

We show in this section that random functions cannot be robustly expressed by (pseudo-)acyclic networks 
of bounded degree. We give proofs and interesting evidence that unbounded degree acyclic networks lie very 
close to the boundary of expression power needed to express the vast majority of functions. For $p \in (0, 1)$ 
and $n$ a positive integer, let $\mathcal{F}_{n,p}$ denote the distribution on functions mapping $\{\pm 1\}^n$ to $\{\pm 1\}$, induced by 
letting each entry of the truth table of the function be $1$ with probability $p$ and $-1$ with probability $1 - p$. 
Theorem 3.9 For constants $\epsilon \in (0, \frac{1}{2})$ and $p \in (0, 1)$, with probability at least $1 - o(1)$, there is no network 
$N$ of degree $o(n)$ that $\epsilon$-robustly expresses a function chosen uniformly at random from $\mathcal{F}_{n,p}$. 

Proof: Fix a network N with maximum degree d. If a boolean function / is e-robustly expressed by N ^ 
then by Corollary 3.7 / must be satisfied on a set of size at least'' 2^*^*^^". Therefore, the probability that a 
function uniformly chosen at random from Tn,p is satisfied on this set is at most . The total number 

of pseudoacyclic boolean networks of degree at most d is at most 2^''"7i'^("). Therefore, applying the union 
bound: 

Pr [37V of degree d e-robustly expressing/] < ' "2^ "„o(") < ^(i) 

if (i = o(n). I 

While the last theorem showed that acyclic networks with slightly less than full degree are not able to 
robustly express most functions, the next theorem shows that without this small restriction acyclic networks 
lie on the boundary of being able to robustly express random functions with fixed density 
$p$. It gives a tradeoff between the density parameter $p$, which determines how high the 
percentage of viable configurations is in expectation, the mutation parameter $\epsilon$, and the survival probability 
$\delta$. 

Theorem 3.10 For any p G (0, 1) and e G (0, i), if a function f is chosen uniformly at random from J-n,p, 
then there is a boolean network N that, in expectation, (e, 1 — (1 — p)^°^~) -robustly expresses f. In other 
words, for any survival probability 6 G (0,1), there is a mutation parameter e G (0,1) such that a boolean 
network N {e, S) -robustly expresses f in expectation. 

The idea of the proof for Theorem 3.10 is to use our procedure for constructing optimally robust acyclic 
networks from decision trees as described in Section 3.2.2 and then to lower-bound the success probability 
of the resulting network. We do not have a proof that the lower bound is tight, and so, it might even be 
possible that a random function from $\mathcal{F}_{n,p}$ is expected to be robustly expressible by an acyclic network. 

In spite of the power of acyclic networks as demonstrated above, there are still some important classes 
of objective functions that are not robustly expressible by them. The next theorem gives some examples: 

Theorem 3.11 

• If a function $f : \{\pm 1\}^n \to \{\pm 1\}$ is a $k$-junta for constant $k$, i.e. $f$ depends only on at most $k$ 
variables, then the optimally robust acyclic network with respect to $f$ has constant success probability. 
For example, dictator functions ($f(x_1, \ldots, x_n) = x_i$ for some $i \in [n]$) are not robustly expressible by 
acyclic networks. 

• Suppose $f : \{\pm 1\}^n \to \{\pm 1\}$ is a symmetric function; then there exists a function $g : [0, n] \to \{\pm 1\}$ 
such that $f(x_1, \ldots, x_n) = g(|\{i : x_i = 1\}|)$ for all $x \in \{\pm 1\}^n$. If $g$ has the property that for any constant 
sized interval $I \subseteq [0, n]$, there exists $s \in I$ such that $g(s) = -1$, then the optimally robust acyclic network 
with respect to $f$ has constant success probability. For example, parities ($f(x_1, \ldots, x_n) = \bigoplus_{i=1}^n x_i$) are 
not robustly expressible by acyclic networks. 
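
The two examples of Theorem 3.11 can be checked numerically for small $n$ with the decision-tree recursion of Section 3.2.2. This sketch relies on Theorem 3.4, so that maximizing over all layered decision trees gives the optimum; it encodes parity as the product of the $\pm 1$ values, and all function names are hypothetical.

```python
from itertools import permutations

def layered_survival(f, order, eps, assignment=None):
    """s(root) of the layered decision tree that reads variables in `order`."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(order):
        x = tuple(assignment[i] for i in range(len(order)))
        return 1.0 if f(x) == 1 else 0.0
    var = order[len(assignment)]
    a, b = (layered_survival(f, order, eps, {**assignment, var: bit})
            for bit in (1, -1))
    return (1 - eps) * max(a, b) + eps * min(a, b)

def best_acyclic_survival(f, n, eps):
    """Maximum over all variable orders (brute force, small n only)."""
    return max(layered_survival(f, order, eps)
               for order in permutations(range(n)))

def dictator(x):
    return x[0]

def parity(x):
    result = 1
    for value in x:
        result *= value
    return result

if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, best_acyclic_survival(dictator, n, 0.1),
              best_acyclic_survival(parity, n, 0.1))
```

For both functions the optimum stays at $1 - \epsilon$ regardless of $n$, so the failure probability does not vanish as the network grows.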

3.3 Cyclic Networks 

In this section we show that using the full power of cyclic networks gives significantly more robust functions. 
As a main result we give a construction of networks always converging to fixpoints which have $o(\frac{n}{\log n})$ 
variables nearly fixed in dependence on all other variables. This allows us to robustly express dictator functions, 
many symmetric functions, and $o(\frac{n}{\log n})$-juntas, all functions shown (e.g. in Theorem 3.11) to be way beyond the 
reach of acyclic networks. This shows that the dynamic behavior of cyclic networks can be used to stabilize 
the potential disruptions of highly critical parts of a regulatory network by random mutations while still 
allowing evolution and changes in other parts. 

Theorem 3.12 There is a cyclic network $N = (x, u)$ on $2T$ variables and a start configuration $y$ with the 
property that with probability $1 - c_\epsilon^{-T}$ an $\epsilon$-mutation $N'$ starting from the configuration $y$ converges to the 
fixpoint $\mathrm{Fix}(N', y)$ in at most 3 steps and the value of a specified variable $x_1$ in the fixpoint is $1$. The value 
for $c_\epsilon$ depends on the mutation rate $\epsilon \in (0, \frac{1}{2})$ and is at least $e^{\frac{(1 - 2\epsilon)^2}{1 - \epsilon}} > 1$. 

Proof: We explicitly describe the network. The network consists of the variable $x_1$ and a set $I$ of $|I| = 2T - 1$ 
indicator variables which are supposed to detect when the variable $x_1$ was affected by a mutation. Their 
update function outputs $-1$ when $x_1 = -1$ or when the majority of the indicator variables is $-1$. The value of 
the update function $u_1$ for the variable $x_1$ is the value of the majority of the indicator variables. The start 
configuration $y$ for the network is the all-one vector. 

For the analysis of this network we want to prove that for any network $N' = (x, u')$ which differs from $N$ by 
fewer than $T$ mutations, a fixpoint $x'(\infty) = \mathrm{Fix}(N', y)$ with $x'(\infty)_1 = 1$ is reached. For this we see that the 
network starts at $t = 1$ with the all-one configuration $x'(1) = y = \mathbf{1}$ and that at the next time $t = 2$ exactly 
the variables whose update function got mutated turn into $-1$. In the case where $u_1$ was not mutated, $x_1$ is 
still $1$ and this is the fixpoint reached by the network. To see this, observe that fewer than $T$ vote switches 
are not enough to switch the majority vote of the indicator variables. This leaves us with the case that 
$u'_1 = -u_1$, in which the value of $x_1$ at time $t = 2$ is $-1$. For $t = 3$ this results in all unmutated indicator 
variables getting $-1$. Since the unmutated indicator variables form a majority, they stay $-1$ for all $t \ge 3$. This 
makes the value of the variable $x_1$ at any time $t \ge 4$ equal to $x'_1(4) = u'_1(x'(3)) = -u_1(x'(3)) = -(-1) = 1$. Thus 
at $t = 4$ the desired fixpoint is reached. 

Since all mutations occur independently with probability $\epsilon$, the probability that an $\epsilon$-mutation $N'$ differs 
from the network $N$ in fewer than $T$ mutations is exactly $\sum_{k=0}^{T-1} \binom{2T}{k} (1 - \epsilon)^{2T-k} \epsilon^k$. By a standard Chernoff bound 
this is at least $1 - e^{-\frac{2((1 - \epsilon)2T - T)^2}{(1 - \epsilon)2T}} = 1 - c_\epsilon^{-T}$ with $c_\epsilon = e^{\frac{(1 - 2\epsilon)^2}{1 - \epsilon}}$. ∎ 
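
The construction in this proof is small enough to simulate exhaustively. The sketch below assumes synchronous updates and models a mutation as negating a node's update function; index 0 stands for $x_1$ and indices 1 to $2T - 1$ for the indicator variables. The helper names are hypothetical.

```python
from itertools import combinations
from math import comb

def step(state, mutated):
    """One synchronous update of the network from the proof of Theorem 3.12."""
    x1, indicators = state[0], state[1:]
    majority = 1 if sum(indicators) > 0 else -1
    indicator_value = -1 if (x1 == -1 or majority == -1) else 1
    base = [majority] + [indicator_value] * len(indicators)
    return tuple(-v if i in mutated else v for i, v in enumerate(base))

def run(T, mutated, max_steps=10):
    """Start from the all-one vector; return (fixpoint, number of updates)."""
    state = (1,) * (2 * T)
    for steps in range(max_steps + 1):
        nxt = step(state, mutated)
        if nxt == state:
            return state, steps
        state = nxt
    return None, None

def check_all(T):
    """Every pattern with fewer than T mutations must end with x_1 = 1."""
    for k in range(T):
        for mutated in combinations(range(2 * T), k):
            state, steps = run(T, set(mutated))
            if state is None or state[0] != 1 or steps > 3:
                return False
    return True

def prob_fewer_than_T(T, eps):
    """Exact probability that fewer than T of the 2T nodes are mutated."""
    return sum(comb(2 * T, k) * eps**k * (1 - eps)**(2 * T - k) for k in range(T))

if __name__ == "__main__":
    print(check_all(3), check_all(4), prob_fewer_than_T(4, 0.1))
```

`check_all` confirms the case analysis of the proof for small `T`, and `prob_fewer_than_T` evaluates the exact sum that the Chernoff bound then estimates.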

Instead of just fixing one variable we can apply the above theorem multiple times and show that any 
objective function which can be satisfied by fixing the values of a few variables can be robustly expressed by 
a cyclic network. 

Theorem 3.13 Any boolean function which can be satisfied by fixing the values of $o(\frac{n}{\log n})$ variables is 
robustly expressible by a cyclic network. 

Proof: By assumption there are $s = o(\frac{n}{\log n})$ variables and a partial assignment $y$ assigning each of these 
variables a value such that every configuration which agrees with $y$ on these variables is viable. For each 
of the $s$ variables we use a network with $2T = \frac{n}{s} = \omega(\log n)$ variables as described in Theorem 3.12 to 
fix its value to the one given by $y$. This results in a network $N$ of size $n$. With probability $(1 - c_\epsilon^{-T})^s \ge 
1 - n c_\epsilon^{-T} = 1 - o(1)$ an $\epsilon$-mutation $N'$ will have fewer than $T$ mutations in each subnetwork and converge 
to a viable fixpoint. ∎ 

Our next theorem substantially extends the range of robustly expressible objective functions. It shows 
that it is possible to even robustly express any functions for which any assignment can be made satisfying 
by changing only a few variables. This is much stronger than the last theorem since it allows us to choose 
our values according to all other variables instead of fixing the values beforehand. 

Theorem 3.14 Let $f$ be a boolean function on the variable set $X = \{x_1, \ldots, x_n\}$ and $S = \{x_{i_1}, \ldots, x_{i_s}\} \subseteq 
X$ be a subset of variables of size $s = |S| = o(\frac{n}{\log n})$. If for any partial assignment $x_{X \setminus S} = y$ of values to 
variables in $X \setminus S$ there is an assignment $x_{S,y}$ to all variables in $S$ which completes the partial assignment 
to a viable configuration, then $f$ is robustly expressible. 

Proof: We directly construct the network expressing $f$. We always choose the all-one configuration as the 
start configuration. We divide the variables in $X \setminus S$ into $s + 1$ blocks $S_0, \ldots, S_s$ of size $t = \frac{n - s}{s + 1} = \omega(\log n)$. 
For each $j = 1, \ldots, s$ the block $S_j$ is assigned to the variable $x_{i_j}$, while the variables in $S_0$ are sync variables. 
Figure 4 illustrates the partitioning of the nodes. 
Figure 4: Shows the partitioning of the nodes in the network in the proof for Theorem 3.14. 



The idea is that it is very unlikely that more than $\frac{t}{2}$ mutations occur in any of the blocks of size $t$. Having 
this in mind we want to prove that any network $N' \ne N$ which differs from $N$ in fewer than $\frac{t}{2}$ mutations in 
every block converges to a viable configuration. 

The update function of every sync variable $x \in S_0$ is $1$ iff all variables in $X$ are $1$. For every $j = 1, \ldots, s$ the 
update function for the variable $x_{i_j}$ is $1$ if every variable in $X$ is $1$, and otherwise outputs the value of its variable 
in the assignment $x_{S, x_{X \setminus S}}$ times the value of the majority of variables in $S_j$ (i.e. inverting it if the majority 
is $-1$). Lastly, for every $j = 1, \ldots, s$ the update function of the variables in a block $S_j$ is $-1$ if (the variable $x_{i_j}$ 
is $-1$ and the majority of $S_0$ is $1$) or if the majority of $S_j$ is $-1$. 

This has the following effect: The all-one start configuration is stable in $N$, giving a $1$ in every unmutated 
variable for $t = 2$ and a $-1$ in every mutated variable. Since $N \ne N'$ there is at least one variable that is $-1$ for 
$t = 2$, giving a value of $-1$ for all unmutated variables in $S_0$ for $t = 3$. Since the majority of variables in $S_0$ 
is not mutated, the majority of $S_0$ is fixed to $-1$ for $t \ge 3$. The majority of variables in any other block $S_j$ 
is $1$ for $t = 1$ and thus also for $t = 2$ since $x_{i_j}(1) = 1$. At $t = 3$ all unmutated variables in $S_j$ update to 
the value of $x_{i_j}(2)$, which we have already argued is $-1$ iff this variable is mutated. Since the majority of $S_0$ 
is $-1$ for $t \ge 3$, the update function for the unmutated variables in $S_j$ reduces to being $-1$ iff the majority 
of them is $-1$. This inductively preserves the value of $x_{i_j}(2)$ for the majority of $S_j$ for all $t \ge 3$. Taking all 
this together shows that all variables in $X \setminus S$ are fixed for all $t \ge 3$. This gives fixed values for variables 
in $S$ for $t \ge 4$. Thus we have a fixpoint, and we can verify indeed that for all unmutated variables $x_{i_j} \in S$ 
the majority of the corresponding block $S_j$ is $1$ and the value of $x_{i_j}$ in the fixpoint is the value in $x_{S, x_{X \setminus S}}$. 
If the variable $x_{i_j} \in S$ is mutated in $N'$, the majority of the corresponding block $S_j$ is $-1$ and the value of 
$x_{i_j}$ in the fixpoint is $-1 \cdot -1 \cdot x_{S, x_{X \setminus S}} = x_{S, x_{X \setminus S}}$ as well. By assumption the resulting fixpoint is a viable 
configuration. 

This shows that if at most $\frac{t}{2}$ mutations occur in each of the $s + 1$ blocks and at least one mutation occurs 
in total, the resulting network $N'$ reaches a viable configuration as a fixpoint. The probability that an 
$\epsilon$-mutation of $N$ fulfills this property is at least $(1 - c_\epsilon^{-t/2})^n \ge 1 - n c_\epsilon^{-\omega(\log n)} = 1 - o(1)$. ∎ 
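
For $s = 1$ the construction can be simulated directly, here with parity as the objective (viable iff the product of all $\pm 1$ values is $+1$). This is a sketch of one reading of the update rules above, not a definitive implementation: it assumes synchronous updates, a mutation negates a node's update function, index 0 is the single variable in $S$, and the block size `T_BLOCK` is a hypothetical small value.

```python
from itertools import combinations
from math import prod

T_BLOCK = 5                        # block size t (hypothetical small value)
N = 2 * T_BLOCK + 1                # index 0: variable in S, then S_0, then S_1
S0 = range(1, T_BLOCK + 1)         # sync block
S1 = range(T_BLOCK + 1, N)         # block assigned to the variable in S

def majority(state, block):
    return 1 if sum(state[i] for i in block) > 0 else -1

def step(state, mutated):
    all_one = all(v == 1 for v in state)
    completion = prod(state[1:])   # value of x_S that makes the total parity +1
    maj0, maj1 = majority(state, S0), majority(state, S1)
    base = [1 if all_one else completion * maj1]
    base += [1 if all_one else -1] * T_BLOCK
    block_value = -1 if (state[0] == -1 and maj0 == 1) or maj1 == -1 else 1
    base += [block_value] * T_BLOCK
    return tuple(-v if i in mutated else v for i, v in enumerate(base))

def run(mutated, max_steps=10):
    state = (1,) * N
    for steps in range(max_steps + 1):
        nxt = step(state, mutated)
        if nxt == state:
            return state, steps
        state = nxt
    return None, None

def small_subsets(block, limit):
    for k in range(limit + 1):
        yield from combinations(block, k)

def check_all():
    """Fewer than t/2 mutations per block: the fixpoint must have parity +1."""
    limit = (T_BLOCK - 1) // 2
    for m0 in small_subsets(S0, limit):
        for m1 in small_subsets(S1, limit):
            for mx in ((), (0,)):
                state, steps = run(set(m0) | set(m1) | set(mx))
                if state is None or prod(state) != 1 or steps > 3:
                    return False
    return True

if __name__ == "__main__":
    print(check_all())
```

Under these assumptions every admissible mutation pattern ends in a viable fixpoint, which illustrates how a single compensating variable lets a cyclic network express parity.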

The last two theorems prove the power of cyclic networks. They can robustly express many functions 
which are out of reach for cyclic networks. To exemplify this we note that Theorem 3.12 and respectively 
Theorem 3.12 respectively allow us to fix values of one or k -< logn variables. This allows us immediately 
to robustly express the dictator and fc-junta functions which where shown to be non robustly expressable by 
acyclic networks at the end of Section 3.2.4. Theorem 3.14 goes even further since it allows to fix variable 
values in hindsight. To demonstrate how big of an advantage this is, taking even s = 1 suffices. Here 



17 



Theorem 3.14 allows us to pick a variable, observe how all other values develop first, and then choose
how to set our variable. This allows us, e.g., to robustly express parity and other more complicated functions,
even though their outcome depends on all variables. In fact, while sublinear-degree acyclic networks
cannot express most (randomly chosen) functions from $T_{n,p}$ even if $p$ is a constant (see Theorem 3.10), Theorem
3.14 implies that with very high probability, a uniformly selected function from $T_{n,p}$ is robustly expressible
by cyclic networks even with an exponentially low $p > 2^{-n/\log n}$. This means that in nearly all environments,
even those with extremely sparse viable configurations, robust cyclic networks exist and work reliably.

4 Conclusion and Future Work 

In the absence of networks, robustness can only be achieved if it is possible to set the gene expression levels
such that nearly all direct changes to these levels are still viable. In other words, the static configuration
of gene expression levels must be chosen such that a large fraction of the configuration's Hamming
neighborhood is viable. What we have shown in this paper is that using boolean networks, instead of just
static configurations, allows us to shape this neighborhood induced by random mutations. A single mutation
to the genotype can cause the phenotype configuration to change by a lot. The choice of the update functions
in the boolean network now defines a differently shaped neighborhood of a phenotype configuration. Studying
acyclic networks, we showed that the neighborhood induced by the update functions of an acyclic network
is a bijective transformation of the usual neighborhood in terms of Hamming distance. This means that all
the randomness or variability introduced by nature is preserved but guided in the direction of still viable
options. This is done by reshaping the neighborhood in such a way that most configurations in it are viable.
The greater the degree of the network, the more "complex" the bijection is allowed to be and thus the greater
the possibility of robustly expressing objective functions.

In contrast, cyclic networks can make the volume of the phenotype neighborhood smaller. That
is, cyclic networks can compress different mutations in genotypic space into the same change in phenotypic
space. This property of cyclic networks makes them useful for fixing the value of nodes in the network,
providing stability. Intuitively, the reason cyclic networks can compress volumes in phenotypic space is that
feedback loops give nodes in the network the power to detect whether they have been mutated. Using this
feedback behavior of cyclic dynamics allows dramatically more powerful transformations, and concentrating
probability mass in a smaller volume of phenotypes helps to protect highly critical parts of an organism.

Even though our work should mostly be understood as a first attempt to formalize and better understand
the nature of robustness, it nevertheless suggests nontrivial predictions for biological systems. Our results
in Section 3.3 indicate that cyclic networks should be more prevalent in biological systems where robustness
is more important than variability. For example, we expect parts of an organism which are responsible for
highly critical functions and survival to be regulated by self-reinforcing feedback loops. Cyclic networks
should also be prevalent in general in organisms living in harsh environments that allow only specialized
and well-adjusted organisms to survive. Conversely, in parts of a biological system where evolvability
is highly desired, our results in Section 3.2 indicate that there will be more use of acyclic and pseudo-acyclic
networks. This is, for example, the case in systems which are friendly in the sense that they allow many viable
phenotypes but are at the same time rapidly changing. In such a situation, the high evolvability of acyclic
and pseudo-acyclic networks maximizes the chance to adjust to changes by guiding the full randomness via
the control over their geometry in the phenotypic space, while their poor robustness does not harm
the development.

Even though experimental research is still far from determining the structure of large regulatory
networks, it is conceivable that the understanding that has been developed for a number of concrete small
biological systems will in the near future be available for a much larger class of regulatory networks. If
systematic experimental evaluation becomes able to extract explicit objective functions, one could
check whether they are robustly expressible by our definition. Even more importantly, one could test the
proposed model and its predictions by comparing the structure and robustness parameters of the networks
our constructions give (e.g. via the decision tree algorithm) with the actual networks occurring in nature.

Notwithstanding experimental validation, our work in this paper is significant because it gives a rigorous
language to examine the robustness of the regulatory network. Even if different notions of mutation or
different genotype-phenotype maps are used in other models of the regulatory network, the basic questions
asked in this paper are still relevant, and we suspect that for most reasonable setups, the answers will be
similar. Furthermore, the very idea that a robust network could be designed and proved to be robust (with
respect to a given mutation model and genotype-phenotype map) was, as far as we know, not made explicit
in previous work and is an important conceptual contribution of our work. This new line of research leaves
open several unresolved questions, which we discuss next.

Future Work 

Beyond the results in this paper, we are interested in a better understanding of cyclic networks. How
powerful are they in concentrating the results of mutations in a smaller volume of the phenotype space?
Mathematically, this question translates directly to the quest for upper bounds on the probability mass
$P(\mathrm{Fix}(N_{\varepsilon,a}) \in S)$ that can be concentrated in an asymptotically small set $S$. We would also like to get
more examples of classes of objective functions for which robust networks are efficiently constructible, like
the acyclic networks for symmetric functions.

In future work, it would also be interesting to analyze alternative mutation models, including ones where
the boolean network can be changed more drastically than here. Is it still possible to obtain any robustness
in such scenarios? Can similar results, gaps, and tradeoffs between robustness, network degree, and structure
be established?

Lastly we would like to advocate similar rigorous studies of other central ideas in biology. For example, it 
would be interesting to understand more precisely the ideas proposed in [Wag05] that robustness promotes 
evolvability. Can we develop a formal model where we show rigorously that robustness to one objective 
function helps a system find solutions to related objective functions? 

A Classes of Robustly Expressible Functions

Although Lemma 3.3 provides a tight condition for robust expressibility, it is not very natural, in the sense 
that the condition cannot be verified easily unless the function is presented in a particular form. In this 
section, we derive some more natural conditions that guarantee robust expressibility. 
We start with a useful definition. 

Definition A.1 (Sequential Cover) A bipartite graph $G = (V_1, V_2, E)$ with $|V_1| = m$ and $|V_2| = n$ is
sequentially coverable if there exists a sequence of vertices $v_1, \ldots, v_k \in V_2$ for some $k \le n$ such that the
following two conditions hold:

(i) Every vertex $v \in V_1$ is a neighbor of some $v_i$.

(ii) Let $G_0 = G$. For $i \in [k]$, inductively define $G_i$ as the induced graph on $G_{i-1} \setminus (\{v_i\} \cup N(v_i))$. Each $v_i$
is a vertex of degree exactly 1 in $G_{i-1}$.

The sequence $v_1, \ldots, v_k$ is called a sequential cover of size $k$ for $G$.

A bipartite graph is thus sequentially coverable if the vertices of $V_2$ can be ordered in such a way that
at most one vertex of $V_1$ is covered at a time. Note that $m \le n$ necessarily if the graph is sequentially
coverable. Also, given a polynomial $p : \mathbb{R}^n \to \mathbb{R}$, let $G_p$, the term-variable graph of $p$, be the bipartite graph
with vertices for every variable and for every term, and with an edge between a variable vertex and a term
vertex iff the variable occurs in the term.
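
To make Definition A.1 concrete, the following Python sketch checks whether a proposed sequence is a sequential cover. The representation (a dictionary `adj` from each vertex of $V_2$ to its set of neighbors in $V_1$) and the name `is_sequential_cover` are assumptions made for illustration and do not appear in the paper.

```python
def is_sequential_cover(adj, v1, seq):
    """Check Definition A.1 for a proposed sequence.

    adj: dict mapping each V2 vertex to the set of its V1 neighbors
         (hypothetical representation chosen for this sketch).
    v1:  set of all V1 vertices.
    seq: proposed sequence v_1, ..., v_k of V2 vertices.
    """
    if len(set(seq)) != len(seq):
        return False                 # the v_i must be distinct
    remaining = set(v1)              # V1 vertices still present in G_{i-1}
    for v in seq:
        live = adj[v] & remaining    # neighbors of v_i inside G_{i-1}
        if len(live) != 1:           # condition (ii): degree exactly 1
            return False
        remaining -= live            # G_i drops v_i together with N(v_i)
    return not remaining             # condition (i): every V1 vertex is covered


# Term-variable graph of x1 + x1*x2 + x1*x2*x3 (terms named T1, T2, T3).
adj = {"x1": {"T1", "T2", "T3"}, "x2": {"T2", "T3"}, "x3": {"T3"}}
terms = {"T1", "T2", "T3"}
good = is_sequential_cover(adj, terms, ["x3", "x2", "x1"])
bad = is_sequential_cover(adj, terms, ["x1", "x2", "x3"])
```

In this small example the order $x_3, x_2, x_1$ passes the check, whereas $x_1, x_2, x_3$ fails because $x_1$ initially has degree 3.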

Theorem A.2 If a function $f : \{\pm 1\}^n \to \{\pm 1\}$ has a sign-representation $\mathrm{sgn}(p(x_1, \ldots, x_n))$ such that $p(x)$
is a degree-$d$ polynomial with constant coefficients and such that its term-variable bipartite graph, $G_p$, has a
sequential cover of size $\Omega(n)$, then $f$ is robustly expressible by a boolean network of degree $d - 1$.

Proof: Let $f$ be a function of the variables $X = \{x_1, \ldots, x_n\}$. We construct a boolean network $N$ that $\varepsilon$-robustly
expresses $f$ for some constant positive $\varepsilon$. Suppose that the term-variable graph $G_p$ is sequentially covered
by the sequence of variables $x_{i_1}, \ldots, x_{i_k} \in X$, where each $i_j$ is a distinct element of $[n]$. Let $x_{i_{k+1}}, \ldots, x_{i_n}$
denote the rest of the variables (in some arbitrary order). For $j \in [k]$, let $T_j$ denote the unique term covered
by the variable $x_{i_j}$. Observe that for $j \in [k]$, $T_j$ can only contain the variables $\{x_{i_\ell}\}_{\ell \ge j}$ and always contains
$x_{i_j}$. In the boolean network $N$, let the update function for the node associated with $x_{i_j}$ be $u_{i_j} = \mathrm{sgn}(T_j / x_{i_j})$
for $j \in [k]$, and let $u_{i_j}$ be an arbitrary element of $\{\pm 1\}$ for $j \in \{k + 1, \ldots, n\}$. It is clear that $N$ is an acyclic
boolean network.

We now show that $f$ is $\varepsilon$-robustly expressed by $N$ for some $\varepsilon \in (0, 1)$. Observe that with probability
at least $1 - 2^{-\Omega(n)}$, at most $2\varepsilon n$ mutations occur. There are a total of $\Omega(n)$ terms in $p$. The terms which
correspond to mutated nodes are strictly negative, while those which do not are strictly positive, because
of our choice of update functions. Since all the coefficients of $p$ are constant, for a small enough constant $\varepsilon$,
at most $2\varepsilon n$ mutations will not be enough to make $p$ evaluate to a negative real. Hence, $N$ expresses $f$ with
probability at least $1 - 2^{-\Omega(n)}$. ∎
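
The construction in the proof of Theorem A.2 can be tried out on a small polynomial. The sketch below is illustrative only: the data layout, the helper names `run_network` and `term_values`, and the example polynomial are hypothetical, and it assumes that a mutated node outputs the negation of its update function.

```python
def sign(z):
    return 1 if z > 0 else -1


def run_network(n, terms, cover, mutated):
    """Evaluate the acyclic network built from a sequential cover.

    terms:   list of (coefficient, tuple_of_variable_indices), one per term of p.
    cover:   list of (variable_index, term_index) pairs in cover order.
    mutated: set of variable indices whose node is mutated.
    Assumption of this sketch: a mutated node negates its update function.
    """
    x = {}
    covered = {v for v, _ in cover}
    # Uncovered nodes get an arbitrary constant (+1 here), flipped if mutated.
    for v in range(n):
        if v not in covered:
            x[v] = -1 if v in mutated else 1
    # T_j only involves variables later in the cover order, so go backwards.
    for v, t in reversed(cover):
        coef, variables = terms[t]
        val = sign(coef)             # sgn(T_j / x_v) = sgn(coef) * other vars
        for w in variables:
            if w != v:
                val *= x[w]
        x[v] = -val if v in mutated else val
    return [x[v] for v in range(n)]


def term_values(terms, x):
    """Value of every term of p at the configuration x."""
    out = []
    for coef, variables in terms:
        val = coef
        for w in variables:
            val *= x[w]
        out.append(val)
    return out


# p(x) = 2*x0*x1 - x1*x2 + 3*x2*x3, covered by x0, x1, x2 (x3 stays uncovered).
terms = [(2, (0, 1)), (-1, (1, 2)), (3, (2, 3))]
cover = [(0, 0), (1, 1), (2, 2)]
clean = term_values(terms, run_network(4, terms, cover, set()))
hit = term_values(terms, run_network(4, terms, cover, {1}))
```

Without mutations every term equals the absolute value of its coefficient; mutating node $x_1$ makes exactly the term it covers negative, which is the property the proof relies on.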

We will say that a sign-representation is acyclic if the term-variable graph contains no cycle. This allows us
to present a more natural class of functions that are robustly expressible.

Theorem A.3 If a boolean function $f : \{\pm 1\}^n \to \{\pm 1\}$ has an acyclic constant-degree sign-representation
with constant coefficients and no degree-1 terms, then $f$ is robustly expressible by an acyclic boolean network
of constant degree.

Proof: We show that $f$ has a sign-representation whose term-variable graph has a sequential cover of size
$\Omega(n)$, thus proving our claim using Theorem A.2. Let $G$ be the term-variable bipartite graph for the given
sign-representation for $f$. Since $G$ is a forest by assumption, there must exist some (at least 2) degree-1
vertices. Furthermore, because there are no degree-1 monomials in the sign-representation, all the degree-1
vertices represent variables, not terms. We construct $S$, a sequential cover of $G$, as follows. Initially $S$ is
empty. Select some degree-1 vertex $v$ in $G$ and append it to $S$. Next, remove $v$ from $G$ and also all the
vertices adjacent to $v$. Note that these adjacent vertices must represent terms. The modified graph is still
a forest and must have some degree-1 vertices. Again, the degree-1 vertices must represent variables, not
terms. Hence, we can repeat the process, appending a degree-1 vertex to $S$, removing it and its adjacent
vertices from $G$, and so on. We stop when no vertices remain that represent terms.

It is clear that $S$ is a sequential cover. We only need to show that $S$ is of size $\Omega(n)$. This is so because
each time a vertex is added to the sequential cover, we remove the unique term the associated variable is
contained in, and this removal can make only a constant number of other vertices isolated (because each
term is of constant degree). Hence, in order for all the variable vertices to either be in $S$ or be isolated after
the removal process, at least $\Omega(n)$ vertices need to be in $S$. ∎
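
The peeling procedure used in the proof of Theorem A.3 can be sketched in a few lines of Python. The function name `greedy_sequential_cover` and the dictionary representation of the term-variable graph are assumptions made for this illustration.

```python
def greedy_sequential_cover(var_terms):
    """Repeatedly peel a degree-1 variable vertex together with its term.

    var_terms: dict mapping each variable to the set of terms containing it
               (hypothetical representation of the term-variable graph).
    Returns a list of (variable, covered_term) pairs in cover order, or
    None if the peeling gets stuck before all terms are removed.
    """
    live = {v: set(ts) for v, ts in var_terms.items()}
    terms_left = set().union(*live.values())
    cover = []
    while terms_left:
        # Look for a variable vertex of degree exactly 1 in the current graph.
        leaf = next((v for v, ts in live.items() if len(ts) == 1), None)
        if leaf is None:
            return None
        (t,) = live.pop(leaf)        # the unique term adjacent to the leaf
        cover.append((leaf, t))
        terms_left.discard(t)
        for ts in live.values():     # delete the term vertex from the graph
            ts.discard(t)
    return cover


# p(x) = x0*x1 + x1*x2 + x2*x3: the term-variable graph is a path (a forest),
# and every term has degree 2, so there are no degree-1 terms.
path = {0: {"a"}, 1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}}
cover = greedy_sequential_cover(path)

# Two terms sharing the same two variables form a cycle; peeling gets stuck.
cycle = {0: {"a", "b"}, 1: {"a", "b"}}
stuck = greedy_sequential_cover(cycle)
```

On the path example the sketch peels $x_0$, $x_1$, $x_2$ in turn; on the cyclic example no degree-1 variable vertex exists, so it returns `None`.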



B Follow-up to Corollary 3.8 

One can strengthen Corollary 3.8 so that the separation between the success probabilities of the optimal
acyclic network and the optimal static assignment is $1 - 2^{-\Omega(n)}$ (instead of 1 — 2^^'^^). The construction
of the objective function in the following proof was suggested by Madhu Sudan.

Corollary B.1 There is a robustly expressible family of functions $f_n : \{\pm 1\}^n \to \{\pm 1\}$ that cannot be robustly
expressed by a static assignment. For any constant $\varepsilon > 0$ and for any static assignment of $\{x_1, \ldots, x_n\}$, the
probability that an $\varepsilon$-mutation of the assignment satisfies $f_n$ is at most $2^{-\Omega(n)}$, while there is an
$(\varepsilon, 1 - 2^{-\Omega(n)})$-robust acyclic network with respect to $f_n$.

Proof: For each $n \ge 1$, consider the function $f_n : \{\pm 1\}^n \to \{\pm 1\}$ where
$f_n(x_1, \ldots, x_n) = \mathrm{sgn}(x_1 + x_1 x_2 + x_1 x_2 x_3 + \cdots + x_1 x_2 \cdots x_n - \frac{n}{4})$. Noting that the term-variable graph of
$p(x) = x_1 + x_1 x_2 + x_1 x_2 x_3 + \cdots + x_1 x_2 \cdots x_n$ is sequentially covered by the sequence $x_n, \ldots, x_1$, it follows
by a probabilistic argument similar to the one in the proof of Theorem A.2 that the function $f_n$ is
$\varepsilon$-robustly expressible for a small enough constant $\varepsilon$.

On the other hand, we next show that $f_n$ cannot be robustly expressed by any static assignment. Fix a
static assignment for $f_n$, and consider an $\varepsilon$-mutation of it. Then, each $x_i$ is an independent random variable
that takes the value $-1$ (respectively $1$) with probability $1 - \varepsilon$ and $1$ (respectively $-1$) with probability $\varepsilon$.
For $i \in \{1, \ldots, n\}$, let $y_i = x_1 x_2 \cdots x_i$. Now, $p(x) = \sum_i y_i$ and therefore
$|\mathbb{E}[p(x)]| = |\sum_i \mathbb{E}[y_i]| \le \sum_i (1 - 2\varepsilon)^i <$ a constant. We need to bound

the concentration around this mean. Note that the $y_i$'s are not independent; instead, their generation process
is exactly captured by the well-studied memoryless Markov process (for an introduction to Markov processes
and eigenvalues see [MTH93]). The Markov process at hand is characterized by $\Pr[y_i = a_i \mid y_{i-1} = a_{i-1}]$ and
is specified by a 2-by-2 stochastic matrix, either
$\begin{pmatrix} 1-\varepsilon & \varepsilon \\ \varepsilon & 1-\varepsilon \end{pmatrix}$ or $\begin{pmatrix} \varepsilon & 1-\varepsilon \\ 1-\varepsilon & \varepsilon \end{pmatrix}$. The eigenvalue
gaps of these two matrices are $2\varepsilon$ and $2(1 - \varepsilon)$ respectively. By a concentration bound on the sum of
elements generated by a Markov chain with eigenvalue gap $\delta$, given in Theorem 4.23 of [DP08], we have that
Pr [| E yi] - > < 2-"(''"). So, with probability at least $1 - 2^{-\Omega(n)}$, $\sum_i y_i < n/4$ and $f_n$ is not
satisfied. ∎
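
The gap between the two cases can also be observed empirically. The following Monte Carlo sketch is an illustration under stated assumptions, not part of the proof: it takes the threshold in $f_n$ to be $n/4$ (matching the bound $\sum_i y_i < n/4$ at the end of the proof), assumes that a mutation negates a node's update function (network case) or flips a variable's value (static case), and tests only the all-ones static assignment, whereas the corollary covers every assignment. The function names and the parameter values are hypothetical.

```python
import random


def network_sum(n, eps, rng):
    # In the network built from the cover x_n, ..., x_1, the term y_i is +1
    # when node x_i is unmutated and -1 when it is mutated.
    return sum(-1 if rng.random() < eps else 1 for _ in range(n))


def static_sum(assignment, eps, rng):
    # Each x_i is flipped independently with probability eps, and
    # y_i = x_1 * ... * x_i is accumulated as a running product.
    total, y = 0, 1
    for a in assignment:
        x = -a if rng.random() < eps else a
        y *= x
        total += y
    return total


def success_rate(sample, n, trials):
    # f_n is satisfied when the sum exceeds the threshold (assumed to be n/4).
    return sum(sample() > n / 4 for _ in range(trials)) / trials


rng = random.Random(0)               # fixed seed so runs are repeatable
n, eps, trials = 400, 0.2, 2000
net_rate = success_rate(lambda: network_sum(n, eps, rng), n, trials)
static_rate = success_rate(lambda: static_sum([1] * n, eps, rng), n, trials)
print(net_rate, static_rate)
```

For these illustrative parameters the network satisfies $f_n$ in essentially every trial, while the static assignment does so only rarely. The corollary's bounds are asymptotic in $n$, so the contrast sharpens as $n$ grows.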